Inspired by empirical studies of networked systems such as the Internet, social networks, and biological
networks, researchers have in recent years developed a variety of techniques and models to
help us understand or predict the behavior of these systems. Here we review developments in this 
field, including such concepts as the small-world effect, degree distributions, clustering, network 
correlations, random graph models, models of network growth and preferential attachment, and 
dynamical processes taking place on networks. 

I. INTRODUCTION

A network is a set of items, which we will call vertices 
or sometimes nodes, with connections between them, 
called edges (Fig. 1). Systems taking the form of networks
(also called "graphs" in much of the mathematical
literature) abound in the world. Examples include the In- 
ternet, the World Wide Web, social networks of acquain- 
tance or other connections between individuals, organi- 
zational networks and networks of business relations be- 
tween companies, neural networks, metabolic networks, 
food webs, distribution networks such as blood vessels 
or postal delivery routes, networks of citations between 
papers, and many others (Fig. 2). This paper reviews recent
(and some not-so-recent) work on the structure and
function of networked systems such as these. 

The study of networks, in the form of mathematical 
graph theory, is one of the fundamental pillars of dis- 
crete mathematics. Euler's celebrated 1735 solution of 
the Königsberg bridge problem is often cited as the first
true proof in the theory of networks, and during the twen- 
tieth century graph theory has developed into a substan- 
tial body of knowledge. 

Networks have also been studied extensively in the so- 
cial sciences. Typical network studies in sociology involve 
the circulation of questionnaires, asking respondents to 
detail their interactions with others. One can then use 
the responses to reconstruct a network in which vertices 
represent individuals and edges the interactions between 
them. Typical social network studies address issues of 
centrality (which individuals are best connected to others 
or have most influence) and connectivity (whether and 
how individuals are connected to one another through 
the network). 

Recent years however have witnessed a substantial new 
movement in network research, with the focus shifting 
away from the analysis of single small graphs and the 
properties of individual vertices or edges within such 
graphs to consideration of large-scale statistical proper- 
ties of graphs. This new approach has been driven largely 
by the availability of computers and communication net- 
works that allow us to gather and analyze data on a 
scale far larger than previously possible. Where stud- 
ies used to look at networks of maybe tens or in extreme 
cases hundreds of vertices, it is not uncommon now to see 
networks with millions or even billions of vertices. This 
change of scale forces upon us a corresponding change in 




FIG. 1 A small example network with eight vertices and ten 
edges. 



our analytic approach. Many of the questions that might 
previously have been asked in studies of small networks 
are simply not useful in much larger networks. A social 
network analyst might have asked, "Which vertex in this
network would prove most crucial to the network's con- 
nectivity if it were removed?" But such a question has 
little meaning in most networks of a million vertices — no 
single vertex in such a network will have much effect at all 
when removed. On the other hand, one could reasonably 
ask a question like, "What percentage of vertices need to 
be removed to substantially affect network connectivity 
in some given way?" and this type of statistical question 
has real meaning even in a very large network. 
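
The statistical question posed above can be explored numerically. The following Python sketch is only an illustration under stated assumptions: the random graph, its size (500 vertices), its mean degree (4), and the function names `largest_component_fraction` and `random_graph` are all hypothetical choices, not taken from any study discussed here. It removes a growing percentage of vertices at random and reports the fraction of all vertices left in the largest connected component.

```python
import random
from collections import deque

def largest_component_fraction(adj, removed):
    """Fraction of all vertices that lie in the largest surviving component."""
    alive = set(adj) - set(removed)
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:                      # breadth-first search over surviving vertices
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w in alive and w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best / len(adj)

def random_graph(n, mean_degree, rng):
    """Undirected random graph: each pair is joined with probability mean_degree/(n-1)."""
    p = mean_degree / (n - 1)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

rng = random.Random(0)            # fixed seed so that a run is repeatable
adj = random_graph(500, 4.0, rng) # assumed size and mean degree
order = list(adj)
rng.shuffle(order)                # random removal order
for percent in (0, 20, 40, 60, 80):
    removed = order[: len(order) * percent // 100]
    print(percent, round(largest_component_fraction(adj, removed), 3))
```

The size of the largest component is only one possible measure of "connectivity"; other measures could be substituted without changing the structure of the calculation.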

However, there is another reason why our approach 
to the study of networks has changed in recent years, a 
reason whose importance should not be underestimated, 
although it often is. For networks of tens or hundreds 
of vertices, it is a relatively straightforward matter to 
draw a picture of the network with actual points and lines 
(Fig. 2) and to answer specific questions about network
structure by examining this picture. This has been one of 
the primary methods of network analysts since the field 
began. The human eye is an analytic tool of remarkable 
power, and eyeballing pictures of networks is an excellent 
way to gain an understanding of their structure. With 
a network of a million or a billion vertices however, this 
approach is useless. One simply cannot draw a mean- 
ingful picture of a million vertices, even with modern 3D 
computer rendering tools, and therefore direct analysis 
by eye is hopeless. The recent development of statistical 
methods for quantifying large networks is to a large ex- 
tent an attempt to find something to play the part played 
by the eye in the network analysis of the twentieth cen- 
tury. Statistical methods answer the question, "How can 
I tell what this network looks like, when I can't actually 
look at it?" 

The body of theory that is the primary focus of this 
review aims to do three things. First, it aims to find sta- 
tistical properties, such as path lengths and degree distri- 
butions, that characterize the structure and behavior of 
networked systems, and to suggest appropriate ways to 
measure these properties. Second, it aims to create mod- 
els of networks that can help us to understand the mean- 
ing of these properties — how they came to be as they are, 
and how they interact with one another. Third, it aims 
to predict what the behavior of networked systems will 
be on the basis of measured structural properties and the 
local rules governing individual vertices. How for exam- 
ple will network structure affect traffic on the Internet, or 
the performance of a Web search engine, or the dynamics 
of social or biological systems? As we will see, the scien- 
tific community has, by drawing on ideas from a broad 
variety of disciplines, made an excellent start on the first 
two of these aims, the characterization and modeling of 
network structure. Studies of the effects of structure on 
system behavior on the other hand are still in their in- 
fancy. It remains to be seen what the crucial theoretical 
developments will be in this area. 

FIG. 2 Three examples of the kinds of networks that are the topic of this review: (a) A food web of predator-prey interactions
between species in a freshwater lake [272]. Picture courtesy of Neo Martinez and Richard Williams. (b) The network of
collaborations between scientists at a private research institution [171]. (c) A network of sexual contacts between individuals
in the study by Potterat et al. [342].



A. Types of networks 

A set of vertices joined by edges is only the simplest 
type of network; there are many ways in which networks 
may be more complex than this (Fig. 3). For instance,
there may be more than one different type of vertex in a 
network, or more than one different type of edge. And 
vertices or edges may have a variety of properties, nu- 
merical or otherwise, associated with them. Taking the 
example of a social network of people, the vertices may 
represent men or women, people of different nationalities, 
locations, ages, incomes, or many other things. Edges 
may represent friendship, but they could also represent 
animosity, or professional acquaintance, or geographical 
proximity. They can carry weights, representing, say, 
how well two people know each other. They can also be 
directed, pointing in only one direction. Graphs com- 
posed of directed edges are themselves called directed 



graphs or sometimes digraphs, for short. A graph rep- 
resenting telephone calls or email messages between in- 
dividuals would be directed, since each message goes in 
only one direction. Directed graphs can be either cyclic, 
meaning they contain closed loops of edges, or acyclic,
meaning they do not. Some networks, such as food webs, 
are approximately but not perfectly acyclic. 
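
Whether a directed graph is cyclic or acyclic can be tested mechanically. The sketch below uses Kahn's algorithm, a standard technique that repeatedly removes vertices with no incoming edges; if every vertex can be removed in this way, the graph has no closed loops. The function name `is_acyclic` and the three-species food web are hypothetical, chosen only to illustrate how a single extra edge (here a self-loop, as might represent cannibalism) makes an otherwise acyclic graph cyclic.

```python
from collections import deque

def is_acyclic(vertices, edges):
    """Return True if the directed graph has no closed loops (Kahn's algorithm)."""
    out = {v: [] for v in vertices}
    indeg = {v: 0 for v in vertices}
    for u, v in edges:
        out[u].append(v)
        indeg[v] += 1
    # Start from vertices that have no incoming edges.
    queue = deque(v for v in indeg if indeg[v] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    # Any vertex that could not be removed lies on, or downstream of, a cycle.
    return removed == len(indeg)

species = ["algae", "snail", "fish"]                 # hypothetical food web
chain = [("algae", "snail"), ("snail", "fish")]      # edge u -> v: u is eaten by v
print(is_acyclic(species, chain))                    # True
print(is_acyclic(species, chain + [("fish", "fish")]))  # False
```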

One can also have hyperedges — edges that join more 
than two vertices together. Graphs containing such edges 
are called hypergraphs. Hyperedges could be used to in- 
dicate family ties in a social network for example — n in- 
dividuals connected to each other by virtue of belonging 
to the same immediate family could be represented by 
an n-edge joining them. Graphs may also be naturally 
partitioned in various ways. We will see a number of 
examples in this review of bipartite graphs: graphs that 
contain vertices of two distinct types, with edges running 
only between unlike types. So-called affiliation networks 

FIG. 3 Examples of various types of networks: (a) an undirected
network with only a single type of vertex and a single
type of edge; (b) a network with a number of discrete ver- 
tex and edge types; (c) a network with varying vertex and 
edge weights; (d) a directed network in which each edge has 
a direction. 

in which people are joined together by common member- 
ship of groups take this form, the two types of vertices 
representing the people and the groups. Graphs may also 
evolve over time, with vertices or edges appearing or dis- 
appearing, or values defined on those vertices and edges 
changing. And there are many other levels of sophistica- 
tion one can add. The study of networks is by no means 
a complete science yet, and many of the possibilities have 
yet to be explored in depth, but we will see examples of 
at least some of the variations described here in the work 
reviewed in this paper. 

The jargon of the study of networks is unfortunately 
confused by differing usages among investigators from 
different fields. To avoid (or at least reduce) confusion, 
we give in Table I a short glossary of terms as they are
used in this paper. 



B. Other resources 

A number of other reviews of this area have appeared 
recently, which the reader may wish to consult. Albert
and Barabási and Dorogovtsev and Mendes [120]
have given extensive pedagogical reviews focusing on the
physics literature. Both devote the larger part of their attention
to the models of growing graphs that we describe
in Sec. VII. Shorter reviews taking other viewpoints have
been given by Newman [309] and Hayes [189, 190], who
both concentrate on the so-called "small-world" models
(see Sec. VI), and by Strogatz [387], who includes an interesting
discussion of the behavior of dynamical systems
on networks.

A number of books also make worthwhile reading.
Dorogovtsev and Mendes [122] have expanded their
above-mentioned review into a book, which again focuses
on models of growing graphs. The edited volumes
by Bornholdt and Schuster [70] and by Pastor-Satorras
and Rubi [330] both contain contributed essays on various
topics by leading researchers. Detailed treatments
of many of the topics covered in the present work can be
found there. The book by Newman et al. [320] is a collection
of previously published papers, and also contains
some review material by the editors.

Three popular books on the subject of networks merit 
a mention. Albert-László Barabási's Linked [31] gives
a personal account of recent developments in the study
of networks, focusing particularly on Barabási's work on
scale-free networks. Duncan Watts's Six Degrees [414]
gives a sociologist's view, partly historical, of discoveries
old and new. Mark Buchanan's Nexus [76] gives an entertaining
portrait of the field from the point of view of
a science journalist. 

Farther afield, there are a variety of books on the study 
of networks in particular fields. Within graph theory the 
books by Harary [188] and by Bollobás [63] are widely
cited, as are, among social network theorists, the books by
Wasserman and Faust [409] and by Scott [363]. The book
by Ahuja et al. is a useful source for information on 
network algorithms. 



C. Outline of the review 

The outline of this paper is as follows. In Sec. II we describe
empirical studies of the structure of networks, including
social networks, information networks, technological
networks and biological networks. In Sec. III we describe
some of the common properties that are observed
in many of these networks, how they are measured, and
why they are believed to be important for the functioning
of networked systems. Sections IV to VII form the heart
of the review. They describe work on the mathematical
modeling of networks, including random graph models
and their generalizations, exponential random graphs,
p* models and Markov graphs, the small-world model
and its variations, and models of growing graphs including
preferential attachment models and their many variations.
In Sec. VIII we discuss the progress, such as it
is, that has been made on the study of processes taking
place on networks, including epidemic processes, network
failure, models displaying phase transitions, and dynamical
systems like random Boolean networks and cellular
automata. In Sec. IX we give our conclusions and point
to directions for future research.



II. NETWORKS IN THE REAL WORLD 

In this section we look at what is known about the 
structure of networks of different types. Recent work 
on the mathematics of networks has been driven largely 
by observations of the properties of actual networks and 
attempts to model them, so network data are the ob- 
vious starting point for a review such as this. It also 
makes sense to examine simultaneously data from different
kinds of networks. One of the principal thrusts
of recent work in this area, inspired particularly by a
groundbreaking 1998 paper by Watts and Strogatz [416],
has been the comparative study of networks from different
branches of science, with emphasis on properties
that are common to many of them and the mathematical
developments that mirror those properties. We here divide
our summary into four loose categories of networks:
social networks, information networks, technological networks
and biological networks.

Vertex (pl. vertices): The fundamental unit of a network, also called a site
(physics), a node (computer science), or an actor (sociology).

Edge: The line connecting two vertices. Also called a bond (physics), a link
(computer science), or a tie (sociology).

Directed/undirected: An edge is directed if it runs in only one direction (such
as a one-way road between two points), and undirected if it runs in both directions.
Directed edges, which are sometimes called arcs, can be thought of as sporting arrows
indicating their orientation. A graph is directed if all of its edges are directed. An
undirected graph can be represented by a directed one having two edges between each
pair of connected vertices, one in each direction.

Degree: The number of edges connected to a vertex. Note that the degree is not
necessarily equal to the number of vertices adjacent to a vertex, since there may be
more than one edge between any two vertices. In a few recent articles, the degree
is referred to as the "connectivity" of a vertex, but we avoid this usage because the
word connectivity already has another meaning in graph theory. A directed graph
has both an in-degree and an out-degree for each vertex, which are the numbers of
in-coming and out-going edges respectively.

Component: The component to which a vertex belongs is that set of vertices
that can be reached from it by paths running along edges of the graph. In a directed
graph a vertex has both an in-component and an out-component, which are the sets
of vertices from which the vertex can be reached and which can be reached from it.

Geodesic path: A geodesic path is the shortest path through the network from
one vertex to another. Note that there may be and often is more than one geodesic
path between two vertices.

Diameter: The diameter of a network is the length (in number of edges) of the
longest geodesic path between any two vertices. A few authors have also used this
term to mean the average geodesic distance in a graph, although strictly the two
quantities are quite distinct.

TABLE I A short glossary of terms.
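
Several of the glossary terms in Table I (degree, component, geodesic path, diameter) can be computed directly from a list of edges. The sketch below is a minimal illustration on a hypothetical seven-vertex undirected graph; the edge list and the function names are assumptions made for the example. Distances are found by breadth-first search, and the diameter is evaluated within a single component, since the distance between vertices in different components is undefined.

```python
from collections import deque

# Hypothetical undirected example graph with two components.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (6, 7)]

adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

def degree(v):
    """Number of edge ends at v; a repeated edge in the list is counted each time."""
    return len(adj[v])

def geodesic_distances(source):
    """Breadth-first search: edges on a shortest path to each reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def component(v):
    """Set of vertices reachable from v."""
    return set(geodesic_distances(v))

def diameter(vertices):
    """Longest geodesic distance between any two of the given, mutually reachable, vertices."""
    return max(d for s in vertices
               for t, d in geodesic_distances(s).items() if t in vertices)

print(degree(3))                 # 3
print(sorted(component(1)))      # [1, 2, 3, 4, 5]
print(geodesic_distances(1)[5])  # 3
print(diameter(component(1)))    # 3
```

Running one breadth-first search per vertex is adequate for small graphs like this one; for very large networks the all-pairs calculation becomes expensive and sampling of source vertices is a common compromise.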



A. Social networks 

A social network is a set of people or groups of people
with some pattern of contacts or interactions between
them [363, 409]. The patterns of friendships between
individuals [296, 348], business relationships between
companies [269, 286], and intermarriages between
families [327] are all examples of networks that have been
studied in the past.¹ Of the academic disciplines the social
sciences have the longest history of the substantial
quantitative study of real-world networks [102, 363]. Of
particular note among the early works on the subject are:
Jacob Moreno's work in the 1920s and 30s on friendship
patterns within small groups [296]; the so-called
"southern women study" of Davis et al. [103], which
focused on the social circles of women in an unnamed
city in the American south in 1936; the study by Elton
Mayo and colleagues of social networks of factory
workers in the late 1930s in Chicago [357]; the mathematical
models of Anatol Rapoport [346], who was one
of the first theorists, perhaps the first, to stress the importance
of the degree distribution in networks of all
kinds, not just social networks; and the studies of friendship
networks of school children by Rapoport and others [149, 348].
In more recent years, studies of business
communities [167, 168, 269] and of patterns of sexual
contacts [45, 218, 243, 266, 303, 342] have attracted particular
attention.

1 Occasionally social networks of animals have been investigated
also, such as dolphins [96], not to mention networks of fictional
characters, such as the protagonists of Tolstoy's Anna Karenina [244]
or Marvel Comics superheroes [10].

Another important set of experiments are the famous
"small-world" experiments of Milgram [283, 393]. No actual
networks were reconstructed in these experiments,
but nonetheless they tell us about network structure.
The experiments probed the distribution of path lengths
in an acquaintance network by asking participants to pass
a letter² to one of their first-name acquaintances in an attempt
to get it to an assigned target individual. Most of
the letters in the experiment were lost, but about a quarter
reached the target and passed on average through the
hands of only about six people in doing so. This experiment
was the origin of the popular concept of the "six
degrees of separation," although that phrase did not appear
in Milgram's writing, being coined some decades
later by Guare [183]. A brief but useful early review of
Milgram's work and work stemming from it was given by
Garfield.

Traditional social network studies often suffer from 
problems of inaccuracy, subjectivity, and small sample 
size. With the exception of a few ingenious indirect 
studies such as Milgram's, data collection is usually car- 
ried out by querying participants directly using question- 
naires or interviews. Such methods are labor-intensive 
and therefore limit the size of the network that can be 
observed. Survey data are, moreover, influenced by sub- 
jective biases on the part of respondents; how one re- 
spondent defines a friend for example could be quite dif- 
ferent from how another does. Although much effort is 
put into eliminating possible sources of inconsistency, it 
is generally accepted that there are large and essentially 
uncontrolled errors in most of these studies. A review of
the issues has been given by Marsden [271].

Because of these problems many researchers have
turned to other methods for probing social networks.
One source of copious and relatively reliable data is collaboration
networks. These are typically affiliation networks
in which participants collaborate in groups of one
kind or another, and links between pairs of individuals
are established by common group membership. A classic,
though rather frivolous, example of such a network is the
collaboration network of film actors, which is thoroughly
documented in the online Internet Movie Database.³ In
this network actors collaborate in films and two actors
are considered connected if they have appeared in a film
together. Statistical properties of this network have been
analyzed by a number of authors [20, 323, 416]. Other
examples of networks of this type are networks of company
directors, in which two directors are linked if they
belong to the same board of directors [104, 105, 269],
networks of coauthorship among academics, in which individuals
are linked if they have coauthored one or more
papers, and
coappearance networks in which individuals are linked
by mention in the same context, particularly on Web
pages [3, 227] or in newspaper articles [99] (see Fig. 2b).

2 Actually a folder containing several documents.

3 http://www.imdb.com/
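
The rule used for the film actor network, under which two actors are connected if they have appeared in a film together, amounts to a one-mode projection of a bipartite affiliation network. The following sketch shows one way this might be coded. The cast lists are placeholders rather than real data, and the function name `project` is hypothetical.

```python
from itertools import combinations

# Hypothetical cast lists; the names are placeholders, not real data.
films = {
    "film_1": ["actor_a", "actor_b", "actor_c"],
    "film_2": ["actor_c", "actor_d"],
    "film_3": ["actor_e"],
}

def project(groups):
    """One-mode projection: link two members if they share at least one group."""
    adj = {m: set() for members in groups.values() for m in members}
    for members in groups.values():
        # Every pair of members of the same group becomes an edge.
        for u, v in combinations(sorted(set(members)), 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj

actors = project(films)
print(sorted(actors["actor_c"]))  # ['actor_a', 'actor_b', 'actor_d']
print(sorted(actors["actor_e"]))  # []
```

The projection discards information: it records that two actors worked together but not in how many films, or in which ones. Keeping the bipartite form, or attaching weights to the projected edges, would preserve more of the original data.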

Another source of reliable data about personal connections
between people is communication records of certain
kinds. For example, one could construct a network
in which each (directed) edge between two people represented
a letter or package sent by mail from one to
the other. No study of such a network has been published
as far as we are aware, but some similar things
have. Aiello et al. have analyzed a network of
telephone calls made over the AT&T long-distance network
on a single day. The vertices of this network represent
telephone numbers and the directed edges calls from
one number to another. Even for just a single day this
graph is enormous, having about 50 million vertices, one
of the largest graphs yet studied after the graph of the
World Wide Web. Ebel et al. [136] have reconstructed
the pattern of email communications between five thousand
students at Kiel University from logs maintained
by email servers. In this network the vertices represent
email addresses and directed edges represent a message
passing from one address to another. Email networks
have also been studied by Newman et al. [321]
and by Guimerà et al. [185], and similar networks have
been constructed for an "instant messaging" system by
Smith [371], and for an Internet community Web site by
Holme et al. [196]. Dodds et al. [110] have carried out
an email version of Milgram's small-world experiment in
which participants were asked to forward an email message
to one of their friends in an effort to get the message
ultimately to some chosen target individual. Response
rates for the experiment were quite low, but a few hundred
completed chains of messages were recorded, enough
to allow various statistical analyses.
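
Communication records of this kind translate naturally into a directed graph. The sketch below builds one from a hypothetical call log and counts in- and out-degrees. The telephone numbers are placeholders, and the decision to merge repeated calls between the same ordered pair into a single edge is an assumption made for the example, not a description of how any of the studies above treated their data.

```python
from collections import Counter

# Hypothetical call log of (caller, callee) pairs; the numbers are placeholders.
calls = [
    ("555-0001", "555-0002"),
    ("555-0001", "555-0003"),
    ("555-0002", "555-0003"),
    ("555-0003", "555-0001"),
    ("555-0001", "555-0002"),   # repeated call, merged below
]

edges = set(calls)                           # one directed edge per ordered pair
out_degree = Counter(u for u, v in edges)    # number of distinct numbers called
in_degree = Counter(v for u, v in edges)     # number of distinct callers

print(out_degree["555-0001"], in_degree["555-0001"])  # 2 1
print(out_degree["555-0003"], in_degree["555-0003"])  # 1 2
```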



B. Information networks 

Our second network category is what we will call information
networks (also sometimes called "knowledge
networks"). The classic example of an information network
is the network of citations between academic papers [138].
Most learned articles cite previous work by
others on related topics. These citations form a network
in which the vertices are articles and a directed edge from
article A to article B indicates that A cites B. The structure
of the citation network then reflects the structure of
the information stored at its vertices, hence the term "information
network," although certainly there are social
aspects to the citation patterns of papers too [420].

Citation networks are acyclic (see Sec. I.A) because
papers can only cite other papers that have already been
written, not those that have yet to be written. Thus all
edges in the network point backwards in time, making
closed loops impossible, or at least extremely rare (see
Fig. 4).

As an object of scientific study, citation networks have 
a great advantage in the copious and accurate data avail- 
able for them. Quantitative study of publication patterns 

FIG. 4 The two best studied information networks. Left: the 
citation network of academic papers in which the vertices are 
papers and the directed edges are citations of one paper by 
another. Since papers can only cite those that came before 
them (lower down in the figure) the graph is acyclic — it has 
no closed loops. Right: the World Wide Web, a network of 
text pages accessible over the Internet, in which the vertices 
are pages and the directed edges are hyperlinks. There are 
no constraints on the Web that forbid cycles and hence it is 
in general cyclic. 



stretches back at least as far as Alfred Lotka's groundbreaking
1926 discovery of the so-called Law of Scientific
Productivity, which states that the distribution of
the numbers of papers written by individual scientists
follows a power law. That is, the number of scientists
who have written k papers falls off as $k^{-\alpha}$ for some
constant $\alpha$. (In fact, this result extends to the arts and
humanities as well.) The first serious work on citation
patterns was conducted in the 1960s as large citation
databases became available through the work of Eugene
Garfield and other pioneers in the field of bibliometrics.
The network formed by citations was discussed in an
early paper by Price [343], in which, among other things,
the author points out for the first time that both the in-
and out-degree distributions of the network follow power
laws, a far-reaching discovery which we discuss further
in Sec. III.C. Many other studies of citation networks
have been performed since then, using the ever better
resources available in citation databases. Of particular
note are the studies by Seglen [364] and Redner [351].⁴
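
A power law of the kind described above appears as a straight line on doubly logarithmic axes, with slope equal to minus the exponent. The sketch below illustrates this with a least-squares fit of log(count) against log(k). It is a crude estimator, shown only because it follows directly from the definition; the data are synthetic, generated to follow an exponent of exactly 2, and the function name `loglog_slope` is hypothetical.

```python
import math

def loglog_slope(counts):
    """Least-squares slope of log(count) against log(k); a crude estimate of -alpha."""
    points = [(math.log(k), math.log(c)) for k, c in counts.items() if c > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# Synthetic data: the number of authors with k papers is exactly 1000 * k^(-2).
counts = {k: 1000 * k ** -2.0 for k in range(1, 51)}
alpha = -loglog_slope(counts)
print(round(alpha, 6))  # 2.0
```

On real data the tail of the distribution is sparsely populated and noisy, so a direct fit of this kind can be unreliable; it is used here only to make the meaning of the exponent concrete.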

4 An interesting development in the study of citation patterns
has been the arrival of automatic citation "crawlers"
that construct citation networks from online papers. Examples
include Citeseer (http://citeseer.nj.nec.com/), SPIRES
(http://www.slac.stanford.edu/spires/hep/) and Citebase
(http://citebase.eprints.org/).

Another very important example of an information
network is the World Wide Web, which is a network of
Web pages containing information, linked together by
hyperlinks from one page to another [203]. The Web should
not be confused with the Internet, which is a physical network
of computers linked together by optical fibre and
other data connections.⁵ Unlike a citation network, the
World Wide Web is cyclic; there is no natural ordering
of sites and no constraints that prevent the appearance
of closed loops (Fig. 4). The Web has been very heavily
studied since its first appearance in the early 1990s, with
the studies by Albert et al., Kleinberg et al. [241],
and Broder et al. [74] being particularly influential. The
Web also appears to have power-law in- and out-degree
distributions (Sec. III.C), as well as a variety of other
interesting properties.

One important point to notice about the Web is that
our data about it come from "crawls" of the network, in
which Web pages are found by following hyperlinks from
other pages [74]. Our picture of the network structure
of the World Wide Web is therefore necessarily biased.
A page will only be found if another page points to it,⁶
and in a crawl that covers only a part of the Web (as all
crawls do at present) pages are more likely to be found
the more other pages point to them [263]. This suggests
for instance that our measurements of the fraction
of pages with low in-degree might be an underestimate.⁷
This behavior contrasts with that of a citation network.
A paper can appear in the citation indices even if it has
never been cited (and in fact a plurality of papers in the
indices are never cited).
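
The bias introduced by crawling can be seen in a very small example. The sketch below follows hyperlinks breadth-first from a seed page over a hypothetical four-page Web; the page names, the function name `crawl`, and the `max_pages` cutoff are all assumptions made for illustration. A page with in-degree zero is never discovered, however many links it contains itself, and a crawl that stops early covers only part of what is reachable.

```python
from collections import deque

# Hypothetical Web: page -> pages it links to.
links = {
    "home": ["news", "about"],
    "news": ["home", "about"],
    "about": ["home"],
    "orphan": ["home", "news"],   # links out, but nothing links to it
}

def crawl(links, seed, max_pages=None):
    """Follow hyperlinks breadth-first from seed; stop once max_pages pages are found."""
    found = [seed]
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                if max_pages is not None and len(found) >= max_pages:
                    return found          # partial crawl: give up here
                seen.add(target)
                found.append(target)
                queue.append(target)
    return found

print(crawl(links, "home"))               # ['home', 'news', 'about']
print(crawl(links, "home", max_pages=2))  # ['home', 'news']
```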

A few other examples of information networks have
been studied to a lesser extent. Jaffe and Trajtenberg [207],
for instance, have studied the network of citations
between US patents, which is similar in some respects
to citations between academic papers. A number
of authors have looked at peer-to-peer networks [205],
which are virtual networks of computers that allow
sharing of files between computer users over local-
or wide-area networks. The network of relations between
word classes in a thesaurus has been studied by
Knuth [244] and more recently by various other authors [234, 304, 384].
This network can be looked upon as
an information network: users of a thesaurus "surf" the
network from one word to another looking for the particular
word that perfectly captures the idea they have
in mind. However, it can also be looked at as a conceptual
network representing the structure of the language,
or possibly even the mental constructs used to represent
the language. A number of other semantic word networks
have also been investigated [157, 369, 384].

5 While the Web is primarily an information network, it, like citation
networks, has social aspects to its structure also.

6 This is not always strictly true. Some Web search engines allow
the submission of pages by members of the public for inclusion in
databases, and such pages need not be the target of links from
any other pages. However, such pages also form a very small
fraction of all Web pages, and certainly the biases discussed here
remain very much present.

7 The degree distribution for the Web shown in Fig. 5 falls off
slightly at low values of the in-degree, which may perhaps reflect
this bias.

Preference networks provide an example of a bipartite
information network. A preference network is a network
with two kinds of vertices representing individuals and
the objects of their preference, such as books or films,
with an edge connecting each individual to the books or
films they like. (Preference networks can also be weighted
to indicate strength of likes or dislikes.) A widely studied
example of a preference network is the EachMovie
database of film preferences.⁸ Networks of this kind form
the basis for collaborative filtering algorithms and recommender
systems, which are techniques for predicting new
likes or dislikes based on comparison of individuals' preferences
with those of others [367]. Collaborative
filtering has found considerable commercial success for
product recommendation and targeted advertising, particularly
with online retailers. Preference networks can
also be thought of as social networks, linking not only
people to objects, but also people to other people with
similar preferences. This approach has been adopted occasionally
in the literature [227].
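
The idea behind collaborative filtering can be illustrated on a toy preference network. The sketch below is one simple neighborhood-style scheme and should not be read as the method of any of the systems cited above: each person is compared with every other by the Jaccard overlap of their liked items, and unseen items are scored by the summed similarity of the people who like them. The names, the data, and the functions `jaccard` and `recommend` are all hypothetical.

```python
# Hypothetical preference network: person -> set of films they like.
likes = {
    "ann": {"film_1", "film_2", "film_3"},
    "bob": {"film_1", "film_2"},
    "cat": {"film_4"},
}

def jaccard(a, b):
    """Overlap between two sets of liked items, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(person, likes):
    """Rank items the person has not yet liked by the similarity of those who like them."""
    scores = {}
    for other, items in likes.items():
        if other == person:
            continue
        sim = jaccard(likes[person], items)
        if sim == 0:
            continue                      # ignore people with no tastes in common
        for item in items - likes[person]:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))

print(recommend("bob", likes))  # ['film_3']
print(recommend("cat", likes))  # []
```

The person-to-person similarities computed along the way are exactly the links between "people with similar preferences" mentioned above, so the same data can be read either as a bipartite information network or as a social network.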



C. Technological networks 

Our third class of networks is technological networks, 
man-made networks designed typically for distribution 
of some commodity or resource, such as electricity or in- 
formation. The electric power grid is a good example. 
This is a network of high-voltage three-phase transmis- 
sion lines that spans a country or a portion of a coun- 
try (as opposed to the local low-voltage a.c. power deliv- 
ery lines that span individual neighborhoods). Statistical 
studies of power grids have been made by, for example,
Watts and Strogatz HH EH and Amaral et al. [2p|.
Other distribution networks that have been studied in-
clude the network of airline routes poj, and networks
of roads |22l|, railways |262l l366j and pedestrian traf-
fic (8tJ. River networks could be regarded as a naturally
occurring form of distribution network (actually a collec-
tion network) [TH EH EH [356], as could the vascu-
lar networks discussed in Sec. II.D. The telephone net-
work and delivery networks such as those used by the
post-office or parcel delivery companies also fall into this 
general category and are presumably studied within the 
relevant corporations, if not yet by academic researchers. 
(We distinguish here between the physical telephone net-
work of wires and cables and the network of who calls
whom, discussed in Sec. II.A.) Electronic circuits [155]
fall somewhere between distribution and communication 
networks. 

Another very widely studied technological network is 
the Internet, i.e., the network of physical connections 
between computers. Since there is a large and ever- 
changing number of computers on the Internet, the struc- 
ture of the network is usually examined at a coarse- 



8 http://research.compaq.com/SRC/eachmovie/



grained level, either the level of routers, special-purpose 
computers on the network that control the movement 
of data, or "autonomous systems," which are groups of 
computers within which networking is handled locally, 
but between which data flows over the public Internet. 
The computers at a single company or university would 
probably form a single autonomous system — autonomous 
systems often correspond roughly with domain names. 

In fact, the network of physical connections on the In- 
ternet is not easy to discover since the infrastructure is 
maintained by many separate organizations. Typically 
therefore, researchers reconstruct the network by reason- 
ing from large samples of point-to-point data routes. So- 
called "traceroute" programs can report the sequence of 
network nodes that a data packet passes through when 
traveling between two points and if we assume an edge 
in the network between any two consecutive nodes along 
such a path then a sufficiently large sample of paths will 
give us a fairly complete picture of the entire network. 
There may however be some edges that never get sam- 
pled, so the reconstruction is typically a good, but not 
perfect, representation of the true physical structure of 
the Internet. Studies of Internet structure have been car-
ried out by, among others, Faloutsos et al. [148], Broida
and Claffy [73 and Chen et al. @.
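
The reconstruction procedure just described can be sketched in a few lines. The following is only an illustration under stated assumptions: the function name `edges_from_paths` and the router labels are hypothetical, and each sampled route is assumed to be already available as an ordered list of node identifiers.

```python
from typing import Iterable, Sequence, Set, Tuple

Edge = Tuple[str, str]

def edges_from_paths(paths: Iterable[Sequence[str]]) -> Set[Edge]:
    """Infer undirected edges from consecutive hops along each sampled path."""
    edges: Set[Edge] = set()
    for path in paths:
        for a, b in zip(path, path[1:]):
            if a != b:  # ignore a hop that repeats the same node
                edges.add((a, b) if a < b else (b, a))
    return edges

# Hypothetical router names, for illustration only.
sample_paths = [
    ["r1", "r2", "r3", "r5"],
    ["r4", "r2", "r3", "r5"],
    ["r1", "r2", "r6"],
]
edges = edges_from_paths(sample_paths)
print(sorted(edges))
```

Any physical link that no sampled route happens to traverse is simply absent from the result, which is the sampling bias noted above.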

D. Biological networks 

A number of biological systems can be usefully rep- 
resented as networks. Perhaps the classic example of 
a biological network is the network of metabolic path- 
ways, which is a representation of metabolic substrates 
and products with directed edges joining them if a 
known metabolic reaction exists that acts on a given 
substrate and produces a given product. Most of us 
will probably have seen at some point the giant maps of 
metabolic pathways that many molecular biologists pin 
to their walls. 9 Studies of the statistical properties of 
metabolic networks have been performed by, for example,
Jeong et al. I2lll340l, Fell and Wagner [153, 405], and
Stelling et al. [383]. A separate network is the network
of mechanistic physical interactions between proteins (as
opposed to chemical reactions among metabolites), which
is usually referred to as a protein interaction network.
Interaction networks have been studied by a number of
authors HE El EH EH EH .

Another important class of biological network is the 
genetic regulatory network. The expression of a gene, 
i.e., the production by transcription and translation of 
the protein for which the gene codes, can be controlled 
by the presence of other proteins, both activators and 



9 The standard chart of the metabolic network is somewhat mis- 
leading. For reasons of clarity and aesthetics, many metabolites 
appear in more than one place on the chart, so that some pairs 
of vertices are actually the same vertex. 
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inhibitors, so that the genome itself forms a switching 
network with vertices representing the proteins and di- 
rected edges representing dependence of protein produc- 
tion on the proteins at other vertices. The statistical 
structure of regulatory networks has been studied re-
cently by various authors. Genetic regula-
tory networks were in fact one of the first networked dy-
namical systems for which large-scale modeling attempts
were made. The early work on random Boolean nets by
Kauffman [224, 225, 226] is a classic in this field, and
anticipated recent developments by several decades. 

Another much studied example of a biological network 
is the food web, in which the vertices represent species 
in an ecosystem and a directed edge from species A to
species B indicates that A preys on B [HI HH (see
Fig. [21}). (Sometimes the relationship is drawn the other
way around, because ecologists tend to think in terms of
energy or carbon flows through food webs; a predator-
prey interaction is thus drawn as an arrow pointing from
prey to predator, indicating energy flow from prey to
predator when the prey is eaten.) Construction of com-
plete food webs is a laborious business, but a number
of quite extensive data sets have become available in
recent years H3, [HI |2Q2 Ez^. Statistical studies of
the topologies of food webs have been carried out by
Solé and Montoya [290, 375], Camacho et al. jH and
Dunne et al. [132, 133, 423], among others. A particu-
larly thorough study of webs of plants and herbivores has
been conducted by Jordano et al. [219], which includes
statistics for no fewer than 53 different networks.

Neural networks are another class of biological net- 
works of considerable importance. Measuring the topol- 
ogy of real neural networks is extremely difficult, but has 
been done successfully in a few cases. The best known 
example is the reconstruction of the 282-neuron neural
network of the nematode C. elegans by White et al. [421].
The network structure of the brain at larger scales than
individual neurons (functional areas and pathways) has
been investigated by Sporns et al. [379, 380].

Blood vessels and the equivalent vascular networks in 
plants form the foundation for one of the most successful 
theoretical models of the effects of network structure on 
the behavior of a networked system, the theory of biolog-
ical allometry [29, 417, 418], although we are not aware
of any quantitative studies of their statistical structure. 

Finally we mention two examples of networks from 
the physical sciences, the network of free energy min-
ima and saddle points in glasses |l3dj | and the network of
conformations of polymers and the transitions between
them H3, both of which appear to have some interest-
ing structural properties.



III. PROPERTIES OF NETWORKS 

Perhaps the simplest useful model of a network is the 
random graph, first studied by Rapoport [346, 347, 378]
and by Erdős and Rényi [141, 142, 143], which we describe
in Sec. IV.A. In this model, undirected edges are
placed at random between a fixed number n of vertices to
create a network in which each of the $\frac{1}{2}n(n-1)$ possible
edges is independently present with some probability p,
and the number of edges connected to each vertex (the
degree of the vertex) is distributed according to a bino-
mial distribution, or a Poisson distribution in the limit
of large n. The random graph has been well studied by
mathematicians [63, 211, 223] and many results, both ap-
proximate and exact, have been proved rigorously. Most
of the interesting features of real-world networks that 
have attracted the attention of researchers in the last few 
years however concern the ways in which networks are 
not like random graphs. Real networks are non-random 
in some revealing ways that suggest both possible mecha- 
nisms that could be guiding network formation, and pos- 
sible ways in which we could exploit network structure 
to achieve certain aims. In this section we describe some 
features that appear to be common to networks of many 
different types. 
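
As a concrete illustration of the random graph just described, the sketch below draws one sample and tabulates its degrees. It is a minimal example rather than an efficient generator: the function name `random_graph_degrees`, the seed, and the values of n and p are our own choices for illustration.

```python
import random
from collections import Counter
from itertools import combinations

def random_graph_degrees(n, p, seed=0):
    """Degrees of one random-graph sample: each of the n(n-1)/2 possible
    undirected edges is present independently with probability p."""
    rng = random.Random(seed)
    degree = [0] * n
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            degree[i] += 1
            degree[j] += 1
    return degree

n, p = 1000, 0.01
degree = random_graph_degrees(n, p)
mean_degree = sum(degree) / n      # expected value is p * (n - 1)
histogram = Counter(degree)        # degree -> number of vertices with that degree
print(mean_degree, sorted(histogram.items())[:5])
```

The histogram should be narrowly peaked around the mean degree p(n - 1), in contrast with the long-tailed distributions of real networks discussed in Sec. III.C.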

A. The small-world effect 

In Sec. II.A we described the famous experiments car-
ried out by Stanley Milgram in the 1960s, in which let-
ters passed from person to person were able to reach a
designated target individual in only a small number of 
steps — around six in the published cases. This result is 
one of the first direct demonstrations of the small-world 
effect, the fact that most pairs of vertices in most net- 
works seem to be connected by a short path through the 
network. 

The existence of the small-world effect had been specu-
lated upon before Milgram's work, notably in a remark-
able 1929 short story by the Hungarian writer Frigyes
Karinthy [222], and more rigorously in the mathematical
work of Pool and Kochen [341] which, although published
after Milgram's studies, was in circulation in preprint
form for a decade before Milgram took up the problem.
Nowadays, the small-world effect has been studied and
verified directly in a large number of different networks. 

Consider an undirected network, and let us define $\ell$
to be the mean geodesic (i.e., shortest) distance between
vertex pairs in a network:

$$\ell = \frac{1}{\frac{1}{2}n(n+1)} \sum_{i \ge j} d_{ij}, \tag{1}$$

where $d_{ij}$ is the geodesic distance from vertex i to ver-
tex j. Notice that we have included the distance from
each vertex to itself (which is zero) in this average. This
is mathematically convenient for a number of reasons,
but not all authors do it. In any case, its inclusion simply
multiplies $\ell$ by $(n-1)/(n+1)$ and hence gives a correc-
tion of order $n^{-1}$, which is often negligible for practical
purposes.

The quantity $\ell$ can be measured for a network of n ver-
tices and m edges in time O(mn) using simple breadth-
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0.59 
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0.16 




136 




email address books 


directed 


16 881 
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WWW nd.edu 


directed 
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word co-occurrence 


undirected 
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0.44 








Internet 
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5.98 


3.31 


2.5 


0.035 


0.39 


-0.189 
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power grid 
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2.67 
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- 
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1.61 
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0.033 


0.012 
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4.34 


11.05 


3.0 


0.010 


0.030 
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peer-to-peer network 


undirected 


880 


1296 


1.47 


4.28 


2.1 


0.012 


0.011 


-0.366 
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metabolic network 


undirected 
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3 686 


9.64 


2.56 


2.2 


0.090 


0.67 


-0.240 


214 


ical 


protein interactions 


undirected 


2115 


2 240 


2.12 


6.80 
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0.072 


0.071 


-0.156 
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bo 


marine food web 


directed 
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4.43 


2.05 




0.16 


0.23 


-0.263 


204 
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freshwater food web 


directed 


92 


997 


10.84 


1.90 




0.20 


0.087 


-0.326 


272 




neural network 


directed 


307 


2 359 


7.68 


3.97 




0.18 


0.28 


-0.226 


416. 421 



TABLE II Basic statistics for a number of published networks. The properties measured are: type of graph, directed or undirected; total number of vertices n; total
number of edges m; mean degree z; mean vertex-vertex distance $\ell$; exponent $\alpha$ of degree distribution if the distribution follows a power law (or "-" if not; in/out-degree
exponents are given for directed graphs); clustering coefficient $C^{(1)}$ from Eq. (3); clustering coefficient $C^{(2)}$ from Eq. (6); and degree correlation coefficient r, Sec. III.F.
The last column gives the citation(s) for the network in the bibliography. Blank entries indicate unavailable data.
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first search 0, also called a "burning algorithm" in the 
physics literature. In Table II we show values of $\ell$ taken
from the literature for a variety of different networks. As 
the table shows, the values are in all cases quite small — 
much smaller than the number n of vertices, for instance. 
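
The breadth-first-search measurement can be sketched as follows. This is a minimal illustration, assuming a connected undirected graph stored as an adjacency dictionary; the function names are ours. It follows the convention stated above of including the n zero-length self-pairs in the average, and it performs one search per vertex, which gives the O(mn) running time.

```python
from collections import deque

def bfs_distances(adj, source):
    """Geodesic distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_geodesic(adj):
    """Mean geodesic distance of a connected undirected graph,
    averaging over the n(n+1)/2 pairs that include each vertex with itself."""
    n = len(adj)
    total = 0
    for s in adj:                        # one BFS per vertex
        dist = bfs_distances(adj, s)
        if len(dist) != n:
            raise ValueError("graph is not connected")
        total += sum(dist.values())      # each unordered pair is counted twice
    return (total / 2) / (n * (n + 1) / 2)

# A three-vertex path 0 - 1 - 2, used only as a toy example.
path = {0: [1], 1: [0, 2], 2: [1]}
ell = mean_geodesic(path)
print(ell)
```

For the three-vertex path the distances are 1, 1 and 2, so the sum is 4 and the average over the six pairs is 2/3; excluding self-pairs would give 4/3, larger by the factor (n+1)/(n-1) = 2.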

The definition (1) of $\ell$ is problematic in networks that
have more than one component. In such cases, there
exist vertex pairs that have no connecting path. Con-
ventionally one assigns infinite geodesic distance to such
pairs, but then the value of $\ell$ also becomes infinite. To
avoid this problem one usually defines $\ell$ on such networks
to be the mean geodesic distance between all pairs that
have a connecting path. Pairs that fall in two different
components are excluded from the average. The figures
in Table II were all calculated in this way. An alterna-
tive and perhaps more satisfactory approach is to define $\ell$
to be the "harmonic mean" geodesic distance between all
pairs, i.e., the reciprocal of the average of the reciprocals:

$$\ell^{-1} = \frac{1}{\frac{1}{2}n(n-1)} \sum_{i > j} d_{ij}^{-1}, \tag{2}$$

where the sum runs over distinct pairs only, since $d_{ii} = 0$
has no finite reciprocal.
Infinite values of $d_{ij}$ then contribute nothing to the sum.
This approach has been adopted only occasionally in net-
work calculations [260], but perhaps should be used more
often.
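
The two ways of handling disconnected networks can be compared directly. The sketch below is an illustration only: the function names are ours, both averages are taken over distinct pairs (self-pairs are left out, since a zero distance has no finite reciprocal), and the five-vertex example graph is invented for the purpose.

```python
from collections import deque

def all_pairs_distances(adj):
    """Yield d_ij for every unordered pair of distinct vertices joined by a path."""
    order = {v: idx for idx, v in enumerate(adj)}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for t, d in dist.items():
            if order[t] < order[s]:      # report each unordered pair once
                yield d

def connected_pair_mean(adj):
    """Mean distance over connected pairs only; unreachable pairs are dropped."""
    d = list(all_pairs_distances(adj))
    return sum(d) / len(d)

def harmonic_mean_distance(adj):
    """Reciprocal of the mean reciprocal distance over all n(n-1)/2 pairs;
    unreachable pairs contribute zero to the sum of reciprocals."""
    n = len(adj)
    inv_sum = sum(1.0 / d for d in all_pairs_distances(adj))
    return (n * (n - 1) / 2) / inv_sum

# Two components: a path 0 - 1 - 2 and a single edge 3 - 4.
g = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
print(connected_pair_mean(g), harmonic_mean_distance(g))
```

Here the connected pairs have distances 1, 1, 2 and 1, so the first measure is 1.25, while the harmonic mean over all ten pairs is 10/3.5, roughly 2.86. The second figure is larger because the six unreachable pairs are counted rather than discarded.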

The small-world effect has obvious implications for the 
dynamics of processes taking place on networks. For 
example, if one considers the spread of information, or 
indeed anything else, across a network, the small-world 
effect implies that that spread will be fast on most real- 
world networks. If it takes only six steps for a rumor 
to spread from any person to any other, for instance, 
then the rumor will spread much faster than if it takes 
a hundred steps, or a million. This affects the number 
of "hops" a packet must make to get from one computer 
to another on the Internet, the number of legs of a jour- 
ney for an air or train traveler, the time it takes for a 
disease to spread throughout a population, and so forth. 
The small-world effect also underlies some well-known
parlor games, particularly the calculation of Erdős num-
bers [107] and Bacon numbers. 10

On the other hand, the small-world effect is also math-
ematically obvious. If the number of vertices within a
distance r of a typical central vertex grows exponentially
with r (and this is true of many networks, including the
random graph, Sec. IV.A), then the value of $\ell$ will in-
crease as log n. In recent years the term "small-world
effect" has thus taken on a more precise meaning: net-
works are said to show the small-world effect if the value
of $\ell$ scales logarithmically or slower with network size for
fixed mean degree. Logarithmic scaling can be proved
for a variety of network models [6ll loa , l88l Il27l Il64j



10 http://www.cs.virginia.edu/oracle/




FIG. 5 Illustration of the definition of the clustering coeffi-
cient C, Eq. (3). This network has one triangle and eight
connected triples, and therefore has a clustering coefficient of
3 × 1/8 = 3/8. The individual vertices have local clustering
coefficients, Eq. (5), of 1, 1, 1/6, 0, and 0, for a mean value,
Eq. (6), of C = 13/30.



and has also been observed in various real-world net-
works. Some networks have mean vertex-
vertex distances that increase more slowly than log n. Bollobás
and Riordan [64] have shown that networks with power-
law degree distributions (Sec. III.C) have values of $\ell$ that
increase no faster than log n / log log n (see also Ref. [164]),
and Cohen and Havlin [95] have given arguments that
suggest that the actual variation may be slower even than
this.
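
The logarithmic scaling can be made explicit with a short heuristic calculation. This is a sketch, not a proof: we assume that the number N(r) of vertices within distance r of a typical vertex grows as $z^r$, where the growth factor z is a symbol introduced here for illustration.

```latex
\begin{align}
  N(r) \sim z^{r}, \qquad N(\ell) \simeq n
  \quad\Longrightarrow\quad
  \ell \simeq \frac{\log n}{\log z}.
\end{align}
```

For example, with z = 10 a network of $n = 10^6$ vertices would have $\ell \simeq 6$, and increasing n a thousandfold adds only three more steps.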



B. Transitivity or clustering 

A clear deviation from the behavior of the random 
graph can be seen in the property of network transitivity, 
sometimes also called clustering, although the latter term 
also has another meaning in the study of networks (see 
Sec. IIILCj) and so can be confusing. In many networks 
it is found that if vertex A is connected to vertex B and 
vertex B to vertex C, then there is a heightened proba- 
bility that vertex A will also be connected to vertex C. 
In the language of social networks, the friend of your 
friend is likely also to be your friend. In terms of network 
topology, transitivity means the presence of a heightened 
number of triangles in the network — sets of three vertices 
each of which is connected to each of the others. It can 
be quantified by defining a clustering coefficient C thus: 

$$C = \frac{3 \times \text{number of triangles in the network}}{\text{number of connected triples of vertices}}, \tag{3}$$

where a "connected triple" means a single vertex with
edges running to an unordered pair of others (see Fig. 5).

In effect, C measures the fraction of triples that have 
their third edge filled in to complete the triangle. The 
factor of three in the numerator accounts for the fact that 
each triangle contributes to three triples and ensures that 
C lies in the range 0 ≤ C ≤ 1. In simple terms, C is
the mean probability that two vertices that are network 
neighbors of the same other vertex will themselves be 
neighbors. It can also be written in the form 

$$C = \frac{6 \times \text{number of triangles in the network}}{\text{number of paths of length two}}, \tag{4}$$
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where a path of length two refers to a directed path start- 
ing from a specified vertex. This definition shows that C 
is also the mean probability that the friend of your friend 
is also your friend. 

The definition of C given here has been widely used 
in the sociology literature, where it is referred to as the 
"fraction of transitive triples." 11 In the mathematical 
and physical literature it seems to have been first dis-
cussed by Barrat and Weigt [40].

An alternative definition of the clustering coefficient,
also widely used, has been given by Watts and Stro-
gatz [416], who proposed defining a local value

$$C_i = \frac{\text{number of triangles connected to vertex } i}{\text{number of triples centered on vertex } i}. \tag{5}$$

For vertices with degree 0 or 1, for which both numerator
and denominator are zero, we put $C_i = 0$. Then the
clustering coefficient for the whole network is the average

$$C = \frac{1}{n} \sum_i C_i. \tag{6}$$

This definition effectively reverses the order of the oper- 
ations of taking the ratio of triangles to triples and of 
averaging over vertices — one here calculates the mean of 
the ratio, rather than the ratio of the means. It tends 
to weight the contributions of low-degree vertices more 
heavily, because such vertices have a small denominator 
in Eq. (5) and hence can give quite different results from
Eq. (3). In Table II we give both measures for a number
of networks (denoted $C^{(1)}$ and $C^{(2)}$ in the table). Nor-
mally our first definition is easier to calculate analyt-
ically, but (6) is easily calculated on a computer and has
found wide use in numerical studies and data analysis. It
is important when reading (or writing) literature in this
area to be clear about which definition of the clustering
coefficient is in use. The difference between the two is
illustrated in Fig. 5.
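
The difference between the two definitions is easy to check numerically. The sketch below computes both on a small graph that we assume matches the counts quoted for the illustrative figure (one triangle and eight connected triples): a triangle with two extra vertices hanging off one corner. The function name and the vertex labels are ours.

```python
from fractions import Fraction
from itertools import combinations

def clustering_coefficients(adj):
    """Return (C1, C2): the ratio of totals, 3 x triangles / triples,
    and the mean over vertices of the local ratio C_i."""
    triangle_corners = 0      # each triangle is seen once from each of its 3 vertices
    triples = 0
    local = []
    for v, nbrs in adj.items():
        k = len(nbrs)
        t_v = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        c_v = k * (k - 1) // 2               # triples centered on v
        triangle_corners += t_v
        triples += c_v
        local.append(Fraction(t_v, c_v) if c_v else Fraction(0))
    c1 = Fraction(triangle_corners, triples)
    c2 = sum(local) / len(local)
    return c1, c2

# Triangle 0-1-2, with vertices 3 and 4 attached to vertex 2 (assumed layout).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3, 4}, 3: {2}, 4: {2}}
c1, c2 = clustering_coefficients(adj)
print(c1, c2)
```

The first definition gives 3/8 and the second 13/30: the two degree-2 vertices each contribute a local value of 1 to the mean, while in the ratio of totals they are outweighed by the six triples centered on the degree-4 vertex.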

The local clustering $C_i$ above has been used quite
widely in its own right in the sociological literature,
where it is referred to as the "network density" [363].
Its dependence on the degree $k_i$ of the central ver-
tex i has been studied by Dorogovtsev et al. |l!3j | and
Szabó et al. [389]; both groups found that $C_i$ falls
off with $k_i$ approximately as $k_i^{-1}$ for certain models
of scale-free networks (Sec. III.C.1). Similar behavior
has also been observed empirically in real-world net-
works H43,|353,|393.

In general, regardless of which definition of the clus- 
tering coefficient is used, the values tend to be consid- 
erably higher than for a random graph with a similar 
number of vertices and edges. Indeed, it is suspected 



that for many types of networks the probability that the 
friend of your friend is also your friend should tend to 
a non-zero limit as the network becomes large, so that
C = O(1) as n → ∞. 12 On the random graph, by con-
trast, $C = O(n^{-1})$ for large n (either definition of C)
and hence the real-world and random graph values can
be expected to differ by a factor of order n. This point
is discussed further in Sec. IV.A.

The clustering coefficient measures the density of tri- 
angles in a network. An obvious generalization is to ask 
about the density of longer loops also: loops of length 
four and above. A number of authors have looked at such 
higher order clustering coefficients [53, 79, 165, 172, 317],
although there is so far no clean theory, similar to a cu- 
mulant expansion, that separates the independent contri- 
butions of the various orders from one another. If more 
than one edge is permitted between a pair of vertices, 
then there is also a lower order clustering coefficient that 
describes the density of loops of length two. This coeffi- 
cient is particularly important in directed graphs where 
the two edges in question can point in opposite directions. 
The probability that two vertices in a directed network 
point to each other is called the reciprocity and is often
measured in directed social networks [363, 409]. It has
been examined occasionally in other contexts too, such as
the World Wide Web H[l32l and email networks (HJ.



C. Degree distributions 

Recall that the degree of a vertex in a network is the 
number of edges incident on (i.e., connected to) that ver- 
tex. We define pk to be the fraction of vertices in the 
network that have degree k. Equivalently, pk is the prob- 
ability that a vertex chosen uniformly at random has 
degree k. A plot of pk for any given network can be 
formed by making a histogram of the degrees of vertices. 
This histogram is the degree distribution for the network. 
In a random graph of the type studied by Erdős and
Rényi [141, 142, 143], each edge is present independently with
the same probability p, and hence the degree distribution is,
as mentioned earlier, binomial, or Poisson in the limit of 
large graph size. Real-world networks are mostly found 
to be very unlike the random graph in their degree dis- 
tributions. Far from having a Poisson distribution, the 
degrees of the vertices in most networks are highly right- 
skewed, meaning that their distribution has a long right 
tail of values that are far above the mean. 

Measuring this tail is somewhat tricky. Although in 
theory one just has to construct a histogram of the de- 
grees, in practice one rarely has enough measurements to 
get good statistics in the tail, and direct histograms are 



11 For example, the standard network analysis program UCInet in-
cludes a function to calculate this quantity for any network.



12 An exception is scale-free networks with $C_i \sim k_i^{-1}$, as described
above. For such networks Eq. (3) tends to zero as n → ∞,
although Eq. (6) is still finite.
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thus us ually rather noisy (see the histograms in Refs.Fil 
[148] and [343] for example). There are two accepted ways
to get around this problem. One is to construct a his-
togram in which the bin sizes increase exponentially with
degree. For example the first few bins might cover de-
gree ranges 1, 2-3, 4-7, 8-15, and so on. The number of
samples in each bin is then divided by the width of the
bin to normalize the measurement. This method of con-
structing a histogram is often used when the histogram
is to be plotted with a logarithmic degree scale, so that
the widths of the bins will appear even. Because the bins
get wider as we get out into the tail, the problems with
statistics are reduced, although they are still present to
some extent as long as $p_k$ falls off faster than $k^{-1}$, which
it must if the distribution is to be integrable.
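
The exponential binning scheme can be sketched as follows. This is an illustration under our own conventions: the function name is hypothetical, bins are the powers-of-two ranges 1, 2-3, 4-7, 8-15 given above, vertices of degree zero are dropped because they cannot be placed on a logarithmic scale, and each count is divided by both the bin width and the number of samples used.

```python
from collections import Counter

def log_binned_histogram(degrees):
    """Histogram with bins 1, 2-3, 4-7, 8-15, ...; returns a list of
    (lowest degree, highest degree, density) with density = count / (width * N)."""
    positive = [k for k in degrees if k >= 1]
    # For k >= 1, k.bit_length() - 1 is the index b of the bin [2^b, 2^(b+1) - 1].
    counts = Counter(k.bit_length() - 1 for k in positive)
    result = []
    for b in sorted(counts):
        low = width = 2 ** b
        density = counts[b] / (width * len(positive))
        result.append((low, 2 * low - 1, density))
    return result

bins = log_binned_histogram([1, 1, 2, 3, 4, 5, 6, 7, 8])
for low, high, density in bins:
    print(low, high, density)
```

In the toy sample the four bins hold 2, 2, 4 and 1 vertices, and dividing by widths 1, 2, 4 and 8 (and by the nine samples) gives densities 2/9, 1/9, 1/9 and 1/72.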

An alternative way of presenting degree data is to make 
a plot of the cumulative distribution function 

$$P_k = \sum_{k'=k}^{\infty} p_{k'}, \tag{7}$$

which is the probability that the degree is greater than 
or equal to k. Such a plot has the advantage that all the 
original data are represented. When we make a conven- 
tional histogram by binning, any differences between the 
values of data points that fall in the same bin are lost. 
The cumulative distribution function does not suffer from 
this problem. The cumulative distribution also reduces 
the noise in the tail. On the downside, the plot doesn't 
give a direct visualization of the degree distribution it- 
self, and adjacent points on the plot are not statistically 
independent, making correct fits to the data tricky. 
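
Computing the cumulative distribution requires no binning at all, as the following minimal sketch shows. The function name is ours, and the input is assumed to be a plain list of vertex degrees.

```python
from collections import Counter

def cumulative_degree_distribution(degrees):
    """Return {k: P_k}, where P_k is the fraction of vertices with degree >= k,
    for every degree k that occurs in the data."""
    n = len(degrees)
    counts = Counter(degrees)
    cumulative = {}
    running = 0
    for k in sorted(counts, reverse=True):   # accumulate from the tail inward
        running += counts[k]
        cumulative[k] = running / n
    return dict(sorted(cumulative.items()))

P = cumulative_degree_distribution([1, 1, 2, 3])
print(P)
```

Every data point moves the curve, so no information is lost, but each value of P_k shares all the data in its tail with its neighbors, which is the source of the statistical dependence between adjacent points mentioned above.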

In Fig. 6 we show cumulative distributions of degree
for a number of the networks described in Sec. II. As
the figure shows, the distributions are indeed all right-
skewed. Many of them follow power laws in their tails:
$p_k \sim k^{-\alpha}$ for some constant exponent $\alpha$. Note that such
power-law distributions show up as power laws in the
cumulative distributions also, but with exponent $\alpha - 1$
rather than $\alpha$:

$$P_k \sim \sum_{k'=k}^{\infty} k'^{-\alpha} \sim k^{-(\alpha-1)}. \tag{8}$$

Some of the other distributions have exponential tails:
$p_k \sim e^{-k/\kappa}$. These also give exponentials in the cumula-
tive distribution, but with the same exponent:

$$P_k = \sum_{k'=k}^{\infty} p_{k'} \sim \sum_{k'=k}^{\infty} e^{-k'/\kappa} \sim e^{-k/\kappa}. \tag{9}$$

This makes power-law and exponential distributions par- 
ticularly easy to spot experimentally, by plotting the cor- 
responding cumulative distributions on logarithmic scales 
(for power laws) or semi-logarithmic scales (for exponen- 
tials). 
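
The two exponents quoted above follow from elementary sums. As a sketch, approximating the power-law sum by an integral (valid for large k and $\alpha > 1$) and summing the exponential case as a geometric series gives

```latex
\begin{align}
  \sum_{k'=k}^{\infty} k'^{-\alpha}
    &\simeq \int_{k}^{\infty} x^{-\alpha}\,dx
     = \frac{k^{-(\alpha-1)}}{\alpha-1}, \\
  \sum_{k'=k}^{\infty} e^{-k'/\kappa}
    &= \frac{e^{-k/\kappa}}{1 - e^{-1/\kappa}} .
\end{align}
```

The power law therefore loses one unit of exponent on summation, while the exponential keeps the same decay constant $\kappa$ and changes only by a constant factor.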

For other types of networks degree distributions can
be more complicated. For bipartite graphs, for instance
(Sec. I.A), there are two degree distributions, one for each
type of vertex. For directed graphs each vertex has both
an in-degree and an out-degree, and the degree distribu-
tion therefore becomes a function $p_{jk}$ of two variables,
representing the fraction of vertices that simultaneously
have in-degree j and out-degree k. In empirical studies
of directed graphs like the Web, researchers have usually
given only the individual distributions of in- and out-
degree [H, yj, [74( , i.e., the distributions derived by sum-
ming $p_{jk}$ over one or other of its indices. This however
discards much of the information present in the joint dis-
tribution. It has been found that in- and out-degrees are
quite strongly correlated in some networks [321], which
suggests that there is more to be gleaned from the joint
distribution than is normally appreciated.
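
The joint distribution and its two marginals can be tabulated directly from an edge list. The sketch below is illustrative only: the function names are ours, vertices are assumed to be labelled 0 to n-1, and edges are assumed to be (source, target) pairs.

```python
from collections import Counter
from fractions import Fraction

def joint_degree_distribution(n, edges):
    """p_jk: fraction of the n vertices that have in-degree j and out-degree k."""
    indeg = [0] * n
    outdeg = [0] * n
    for s, t in edges:
        outdeg[s] += 1
        indeg[t] += 1
    pairs = Counter(zip(indeg, outdeg))
    return {jk: Fraction(c, n) for jk, c in pairs.items()}

def marginal(p_jk, axis):
    """Sum p_jk over the other index: axis=0 gives the in-degree
    distribution, axis=1 the out-degree distribution."""
    out = Counter()
    for jk, prob in p_jk.items():
        out[jk[axis]] += prob
    return dict(out)

p_jk = joint_degree_distribution(3, [(0, 1), (0, 2), (1, 2)])
p_in = marginal(p_jk, 0)
p_out = marginal(p_jk, 1)
print(p_jk, p_in, p_out)
```

In this three-vertex example both marginals are uniform over degrees 0, 1 and 2, yet the joint distribution shows that in- and out-degree are perfectly anticorrelated, which is exactly the kind of information the marginals discard.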



1. Scale-free networks 

Networks with power-law degree distributions have
been the focus of a great deal of attention in the lit-
erature EE EM H13. They are sometimes referred to
as scale-free networks |22, although it is only their de-
gree distributions that are scale-free; 13 one can and usu-
ally does have scales present in other network properties.
The earliest published example of a scale-free network is
probably Price's network of citations between scientific
papers [343] (see Sec. II.B). He quoted a value of $\alpha = 2.5$
to 3 for the exponent of his network. In a later paper he
quoted a more accurate figure of $\alpha = 3.04$ [344]. He also
found a power-law distribution for the out-degree of the
network (number of bibliography entries in each paper),
although later work has called this into question [396].
More recently, power-law degree distributions have been
observed in a host of other networks, including no-
tably other citation networks [351, 364], the World Wide
Web HBEI, the Internet [86L Il48l EmI, metabolic
networks [212, 214], telephone call graphs la B, and the
network of human sexual contacts [218, 266]. The de-
gree distributions of some of these networks are shown in
Fig. 6.

Other common functional forms for the degree distri-
bution are exponentials, such as those seen in the power
grid [2(j and railway networks [366], and power laws with
exponential cutoffs, such as those seen in the network of
movie actors [2(j and some collaboration networks [313].
Note also that while a particular form may be seen in the 
degree distribution for the network as a whole, specific 
subnetworks within the network can have other forms. 
The World Wide Web, for instance, shows a power-law 



13 The term "scale-free" refers to any functional form f(x) that re-
mains unchanged to within a multiplicative factor under a rescal- 
ing of the independent variable x. In effect this means power-law 
forms, since these are the only solutions to f(ax) = bf(x), and 
hence "power-law" and "scale-free" are, for our purposes, syn- 
onymous. 
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FIG. 6 Cumulative degree distributions for six different networks. The horizontal axis for each panel is vertex degree k (or in- 
degree for the citation and Web networks, which are directed) and the vertical axis is the cumulative probability distribution of 
degrees, i.e., the fraction of vertices that have degree greater than or equal to k. The networks shown are: (a) the collaboration
network of mathematicians [182]; (b) citations between 1981 and 1997 to all papers cataloged by the Institute for Scientific
Information [351]; (c) a 300 million vertex subset of the World Wide Web, circa 1999 E|; (d) the Internet at the level of
autonomous systems, April 1999 |8q|; (e) the power grid of the western United States [416]; (f) the interaction network of
proteins in the metabolism of the yeast S. cerevisiae [212]. Of these networks, three of them, (c), (d) and (f), appear to have
power-law degree distributions, as indicated by their approximately straight-line forms on the doubly logarithmic scales, and 
one (b) has a power-law tail but deviates markedly from power-law behavior for small degree. Network (e) has an exponential 
degree distribution (note the log-linear scales used in this panel) and network (a) appears to have a truncated power-law degree 
distribution of some type, or possibly two separate power-law regimes with different exponents. 



degree distribution overall but unimodal distributions 
within domains [338].



2. Maximum degree 

The maximum degree $k_{\max}$ of a vertex in a network
will in general depend on the size of the network. For
some calculations on networks the value of this maxi-
mum degree matters (see, for example, Sec. VIII.C.2).
In work on scale-free networks, Aiello et al. [8] assumed
that the maximum degree was approximately the value
above which there is less than one vertex of that degree in
the graph on average, i.e., the point where $np_k = 1$. This
means, for instance, that $k_{\max} \sim n^{1/\alpha}$ for the power-law
degree distribution $p_k \sim k^{-\alpha}$. This assumption however
can give misleading results; in many cases there will be
vertices in the network with significantly higher degree
than this, as discussed by Adamic et al. jg.

Given a particular degree distribution (and assuming
all degrees to be sampled independently from it, which
may not be true for networks in the real world), the prob-
ability of there being exactly m vertices of degree k and
no vertices of higher degree is $\binom{n}{m} p_k^m (1 - P_k)^{n-m}$, where
$P_k$ is the cumulative probability distribution, Eq. (7).
Hence the probability $h_k$ that the highest degree on the
graph is k is

$$h_k = \sum_{m=1}^{n} \binom{n}{m} p_k^m (1 - P_k)^{n-m} = (p_k + 1 - P_k)^n - (1 - P_k)^n, \tag{10}$$

and the expected value of the highest degree is $k_{\max} = \sum_k k h_k$.
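
The closed form for the distribution of the highest degree can be checked on a toy case. The sketch below evaluates $h_k = (p_k + 1 - P_k)^n - (1 - P_k)^n$ exactly with rational arithmetic; the function name and the uniform six-valued degree distribution are assumptions made purely for illustration, chosen because the answer (the larger of two dice) is easy to verify by hand.

```python
from fractions import Fraction

def highest_degree_distribution(p, n):
    """h_k for each degree k in p, where p maps k to p_k, degrees are drawn
    independently for n vertices, and P_k is the probability of degree >= k."""
    h = {}
    tail = Fraction(0)
    for k in sorted(p, reverse=True):
        tail += p[k]                          # tail now equals P_k
        h[k] = (p[k] + 1 - tail) ** n - (1 - tail) ** n
    return h

# Toy distribution (an assumption for illustration): degrees 1..6 equally likely.
p = {k: Fraction(1, 6) for k in range(1, 7)}
h = highest_degree_distribution(p, n=2)
expected_max = sum(k * hk for k, hk in h.items())
print(h[6], expected_max)
```

With n = 2 the result is $h_k = (2k-1)/36$, which sums to one over k and gives an expected maximum of 161/36.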

For both small and large values of k, $h_k$ tends to zero,
and the sum over k is dominated by the terms close to the
maximum. Thus, in most cases, a good approximation
to the expected value of the maximum degree is given
by the modal value. Differentiating and observing that
$dP_k/dk = -p_k$, we find that the maximum of $h_k$ occurs
when

$$\left(\frac{dp_k}{dk} + p_k\right)(p_k + 1 - P_k)^{n-1} - p_k (1 - P_k)^{n-1} = 0, \tag{11}$$

or $k_{\max}$ is a solution of

$$\frac{dp_k}{dk} \simeq -n p_k^2, \tag{12}$$
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where we have made the (fairly safe) assumption that 
Pk is sufficiently small for k > fc max that np^ -C 1 and 
Pk « 1. 
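
The step from the stationarity condition to the approximate equation for the modal value is not spelled out above. One way to see it, as a sketch under the same assumptions ($np_k \ll 1$ and $P_k \ll 1$), is to write the condition as $(dp_k/dk + p_k)(p_k + 1 - P_k)^{n-1} = p_k (1 - P_k)^{n-1}$, divide by $(1 - P_k)^{n-1}$, and expand to first order in $np_k$: 

$$\left(\frac{dp_k}{dk} + p_k\right)\left(1 + \frac{p_k}{1 - P_k}\right)^{n-1} = p_k \;\Longrightarrow\; \left(\frac{dp_k}{dk} + p_k\right)(1 + n p_k) \simeq p_k \;\Longrightarrow\; \frac{dp_k}{dk} \simeq -\frac{n p_k^2}{1 + n p_k} \simeq -n p_k^2 .$$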

For example, if $p_k \sim k^{-\alpha}$ in its tail, then we find that 

$$k_{\max} \sim n^{1/(\alpha-1)}. \tag{13}$$



As shown by Cohen et al. [93], a simple rule of thumb that 
leads to the same result is that the maximum degree is 
roughly the value of $k$ that solves $nP_k = 1$. Note however 
that, as shown by Dorogovtsev and Samukhin [129], the 
fluctuations in the tail of the degree distribution are very 
large for the power-law case. 

Dorogovtsev et al. [126] have also shown that Eq. (13) 
holds for networks generated using the "preferential 
attachment" procedure of Barabasi and Albert [32] 
described in Sec. VII.B, and a detailed numerical study 
of this case has been carried out by Moreira et al. [295]. 
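
The distribution of the highest degree can also be evaluated numerically. The following is a minimal sketch, not taken from any of the works cited: it assumes a pure power law $p_k \propto k^{-\alpha}$ for $k \ge 1$ with an artificial cutoff `K` (introduced only to keep the sums finite), computes $h_k = (p_k + 1 - P_k)^n - (1 - P_k)^n$ directly, and compares its mode with the scaling $n^{1/(\alpha-1)}$. All function names are illustrative. 

```python
def power_law(alpha, K):
    """Normalized p_k proportional to k^(-alpha) for 1 <= k <= K (p_0 = 0).
    The cutoff K is an assumption made only to keep the sums finite."""
    weights = [0.0] + [k ** -alpha for k in range(1, K + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def highest_degree_distribution(p, n):
    """h_k = (p_k + 1 - P_k)^n - (1 - P_k)^n, with P_k = sum_{k' >= k} p_k'."""
    K = len(p) - 1
    tail = [0.0] * (K + 2)            # tail[k] = P_k, and tail[K + 1] = 0
    for k in range(K, -1, -1):
        tail[k] = tail[k + 1] + p[k]
    h = []
    for k in range(K + 1):
        at_most_k = max(0.0, 1.0 - tail[k + 1])   # equals p_k + 1 - P_k
        below_k = max(0.0, 1.0 - tail[k])         # equals 1 - P_k
        h.append(at_most_k ** n - below_k ** n)
    return h


def modal_highest_degree(p, n):
    """Value of k at which h_k is largest."""
    h = highest_degree_distribution(p, n)
    return max(range(len(h)), key=h.__getitem__)


if __name__ == "__main__":
    alpha = 2.5
    p = power_law(alpha, K=20000)
    for n in (10**3, 10**4, 10**5):
        # columns: n, mode of h_k, and the scaling estimate n^(1/(alpha-1))
        print(n, modal_highest_degree(p, n), round(n ** (1 / (alpha - 1))))
```

The mode and the scaling estimate differ by a constant factor, as one would expect from a relation that fixes only the exponent; it is the growth with $n$ that should agree. 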



D. Network resilience 

Related to degree distributions is the property of re- 
silience of networks to the removal of their vertices, which 
has been the subject of a good deal of attention in the 
literature. Most of the networks we have been consider- 
ing rely for their function on their connectivity, i.e., the 
existence of paths leading between pairs of vertices. If 
vertices are removed from a network, the typical length of 
these paths will increase, and ultimately vertex pairs will 
become disconnected and communication between them 
through the network will become impossible. Networks 
vary in their level of resilience to such vertex removal. 

There are also a variety of different ways in which ver- 
tices can be removed and different networks show vary- 
ing degrees of resilience to these also. For example, one 
could remove vertices at random from a network, or one 
could target some specific class of vertices, such as those 
with the highest degrees. Network resilience is of partic- 
ular importance in epidemiology, where "removal" of ver- 
tices in a contact network might correspond for example 
to vaccination of individuals against a disease. Because 
vaccination not only prevents the vaccinated individuals 
from catching the disease but may also destroy paths be- 
tween other individuals by which the disease might have 
spread, it can have a wider reaching effect than one might 
at first think, and careful consideration of the efficacy of 
different vaccination strategies could lead to substantial 
advantages for public health. 

Recent interest in network resilience has been sparked 
by the work of Albert et al., who studied the effect 
of vertex deletion in two example networks, a 6000-vertex 
network representing the topology of the Internet 
at the level of autonomous systems (see Sec. II.C), and 
a 326 000-page subset of the World Wide Web. Both 
the Internet and the Web have been observed to have 
degree distributions that are approximately power-law in 
form (Sec. III.C.1). The authors 
measured average vertex-vertex distances as a function 

FIG. 7 Mean vertex-vertex distance on a graph represen- 
tation of the Internet at the autonomous system level, as 
vertices are removed one by one. If vertices are removed in 
random order (squares), distance increases only very slightly, 
but if they are removed in order of their degrees, starting with 
the highest degree vertices (circles), then distance increases 
sharply. After Albert et al. 



of number of vertices removed, both for random removal 
and for progressive removal of the vertices with the 
highest degrees (see footnote 14). In Fig. 7 we show their results for the 
Internet. They found for both networks that distance 
was almost entirely unaffected by random vertex removal, 
i.e., the networks studied were highly resilient to this type 
of removal. This is intuitively reasonable, since most 
of the vertices in these networks have low degree and 
therefore lie on few paths between others; thus their re- 
moval rarely affects communications substantially. On 
the other hand, when removal is targeted at the high- 
est degree vertices, it is found to have devastating effect. 
Mean vertex-vertex distance increases very sharply with 
the fraction of vertices removed, and typically only a few 
percent of vertices need be removed before essentially all 
communication through the network is destroyed. Al- 
bert et al. expressed their results in terms of failure or 
sabotage of network nodes. The Internet (and the Web), 
they suggest, is highly resilient against the random fail- 
ure of vertices in the network, but highly vulnerable to 
deliberate attack on its highest-degree vertices. 
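
The qualitative contrast between random failure and targeted attack can be reproduced on a small synthetic network. The sketch below is only illustrative: it does not use the Internet or Web data, but a hypothetical stand-in graph grown by a simple preferential-attachment rule, and it measures the mean geodesic distance over vertex pairs that remain connected. Degrees are recalculated after each targeted removal. All names and parameter values are assumptions. 

```python
import random
from collections import deque


def preferential_attachment_graph(n, m, rng):
    """Hypothetical stand-in network: each new vertex attaches to m distinct
    existing vertices chosen with probability proportional to their degree."""
    adj = {v: set() for v in range(n)}
    stubs = []                         # each vertex appears once per edge end
    for u in range(m + 1):             # seed: complete graph on m + 1 vertices
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            stubs += [u, v]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs += [new, t]
    return adj


def mean_distance(adj):
    """Mean geodesic distance over all pairs that are still connected."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:                   # breadth-first search from source
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")


def attack(adj, fraction, targeted, rng):
    """Remove a fraction of vertices and return the resulting mean distance."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}    # work on a copy
    for _ in range(int(fraction * len(adj))):
        if targeted:
            v = max(adj, key=lambda x: len(adj[x]))    # current highest degree
        else:
            v = rng.choice(sorted(adj))                # uniformly at random
        for w in adj.pop(v):
            adj[w].discard(v)
    return mean_distance(adj)


if __name__ == "__main__":
    g = preferential_attachment_graph(300, 2, random.Random(1))
    print("intact:  ", mean_distance(g))
    print("random:  ", attack(g, 0.05, False, random.Random(2)))
    print("targeted:", attack(g, 0.05, True, random.Random(2)))
```

Averaging only over connected pairs is a choice of convenience here; once the network fragments heavily, the size of the largest component is the more informative quantity. 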

Similar results to those of Albert et al. were found 
independently by Broder et al. for a much larger subset 
of the Web graph. Interestingly, however, Broder et al. 



14 In removing the vertices with the highest degrees, Albert et al. 
recalculated degrees following the removal of each vertex. Most 
other authors who have studied this issue have adopted a slightly 
different strategy of removing vertices in order of their initial 
degree in the network before any removal. 
gave an entirely opposite interpretation of their results. 
They found that in order to destroy connectivity in the 
Web one has to remove all vertices with degree greater 
than five, which seems like a drastic attack on the net- 
work, given that some vertices have degrees in the thou- 
sands. They thus concluded that the network was very 
resilient against targeted attack. In fact however there 
is not such a conflict between these results as at first ap- 
pears. Because of the highly skewed degree distribution 
of the Web, the fraction of vertices with degree greater 
than five is only a small fraction of all vertices. 

Following these studies, many authors have looked into 
the question of resilience for other networks. In gen- 
eral the picture seems to be consistent with that seen 
in the Internet and Web. Most networks are robust 
against random vertex removal but considerably less 
robust to targeted removal of the highest-degree vertices. 
Jeong et al. [212] have looked at metabolic networks, 
Dunne et al. [132, 133] at food webs, Newman et al. [321] 
at email networks, and a variety of authors at resilience of 
model networks, which we discuss in 
more detail in later sections of the review. A particularly 
thorough study of the resilience of both real-world and 
model networks has been conducted by Holme et al. [200], 
who looked not only at vertex removal but also at removal 
of edges, and considered some additional strategies for 
selecting vertices based on so-called "betweenness" (see 
Secs. III.G and III.I). 



E. Mixing patterns 

Delving a little deeper into the statistics of network 
structure, one can ask about which vertices pair up with 
which others. In most kinds of networks there are at 
least a few different types of vertices, and the 
probabilities of connection between vertices often depend on 
their types. For example, in a food web representing which 
species eat which in an ecosystem (Sec. II.D) one sees 
vertices representing plants, herbivores, and carnivores. 
Many edges link the plants and herbivores, and many 
more the herbivores and carnivores. But there are few 
edges linking herbivores to other herbivores, or carnivores 
to plants. For the Internet, Maslov et al. [275] 
have proposed that the structure of the network reflects 
the existence of three broad categories of nodes: high- 
level connectivity providers who run the Internet back- 
bone and trunk lines, consumers who are end users of 
Internet service, and ISPs who join the two. Again there 
are many links between end users and ISPs, and many 
between ISPs and backbone operators, but few between 
ISPs and other ISPs, or between backbone operators and 
end users. 

In social networks this kind of selective linking is called 
assortative mixing or homophily and has been widely 
studied, as it has also in epidemiology. (The term "as- 
sortative matching" is also seen in the ecology literature, 
particularly in reference to mate choice among animals.) 
TABLE III Couples in the study of Catania et al. [85] tabulated 
by race of either partner. After Morris [302]. 

A classic example of assortative mixing in social networks 
is mixing by race. Table III for example reproduces 
results from a study of 1958 couples in the city of San 
Francisco, California. Among other things, the study 
recorded the race (self-identified) of study participants in 
each couple. As the table shows, participants appear to 
draw their partners preferentially from those of their own 
race, and this is believed to be a common phenomenon in 
many social networks: we tend to associate preferentially 
with people who are similar to ourselves in some way. 

Assortative mixing can be quantified by an "assortativity 
coefficient," which can be defined in a couple of different 
ways. Let $E_{ij}$ be the number of edges in a network 
that connect vertices of types $i$ and $j$, with $i, j = 1 \ldots N$, 
and let $E$ be the matrix with elements $E_{ij}$, as depicted 
in Table III. We define a normalized mixing matrix by 

$$e = \frac{E}{\| E \|}, \tag{14}$$

where $\| x \|$ means the sum of all the elements of the 
matrix $x$. The elements $e_{ij}$ measure the fraction of edges 
that fall between vertices of types $i$ and $j$. One can also 
ask about the conditional probability $P(j|i)$ that my 
network neighbor is of type $j$ given that I am of type $i$, which 
is given by $P(j|i) = e_{ij} / \sum_j e_{ij}$. These quantities satisfy 
the normalization conditions 

$$\sum_{ij} e_{ij} = 1, \qquad \sum_j P(j|i) = 1. \tag{15}$$

Gupta et al. [186] have suggested that assortative mixing 
be quantified by the coefficient 

$$Q = \frac{\sum_i P(i|i) - 1}{N - 1}. \tag{16}$$

This quantity has the desirable properties that it is 1 for 
a perfectly assortative network (every edge falls between 
vertices of the same type), and 0 for randomly mixed 
networks, and it has been quite widely used in the literature. 
But it suffers from two shortcomings [318]: (1) for 
an asymmetric matrix like the one in Table III, $Q$ has two 
different values, depending on whether we put the men 
or the women along the horizontal axis, and it is unclear 
which of these two values is the "correct" one for the 
network; (2) the measure weights each vertex type equally, 
regardless of how many vertices there are of each type, 
which can give rise to misleading figures for Q in cases 
where community size is heterogeneous, as it often is. 

An alternative assortativity coefficient that remedies 
these problems is defined by [318] 

$$r = \frac{\mathrm{Tr}\, e - \| e^2 \|}{1 - \| e^2 \|}. \tag{17}$$

This quantity is also 0 in a randomly mixed network 
and 1 in a perfectly assortative one. But its value is 
not altered by transposition of the matrix and it weights 
vertices equally rather than communities, so that small 
communities make an appropriately small contribution 
to $r$. For the data of Table III we find $r = 0.621$. 
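
Both coefficients are straightforward to evaluate from a matrix of edge counts. The sketch below is illustrative only: the count matrix is hypothetical (it is not the data of Table III), the function names are our own, and for $Q$ we assume the form $Q = (\sum_i P(i|i) - 1)/(N - 1)$ with $P(j|i) = e_{ij}/\sum_j e_{ij}$. The coefficient $r$ uses the identity $\| e^2 \| = \sum_i a_i b_i$, where $a_i$ and $b_i$ are the row and column sums of $e$. 

```python
def normalized(E):
    """Mixing matrix e = E / ||E||, where ||E|| is the sum of all elements."""
    total = sum(sum(row) for row in E)
    return [[x / total for x in row] for row in E]


def coefficient_Q(E):
    """Assumed form: Q = (sum_i P(i|i) - 1) / (N - 1), P(i|i) = E_ii / sum_j E_ij."""
    N = len(E)
    diagonal = sum(E[i][i] / sum(E[i]) for i in range(N))
    return (diagonal - 1) / (N - 1)


def coefficient_r(E):
    """r = (Tr e - ||e^2||) / (1 - ||e^2||)."""
    e = normalized(E)
    N = len(e)
    a = [sum(e[i]) for i in range(N)]                        # row sums
    b = [sum(e[i][j] for i in range(N)) for j in range(N)]   # column sums
    trace = sum(e[i][i] for i in range(N))
    e2 = sum(a[i] * b[i] for i in range(N))                  # sum of elements of e^2
    return (trace - e2) / (1 - e2)


def transpose(E):
    return [list(col) for col in zip(*E)]


if __name__ == "__main__":
    # Hypothetical, asymmetric matrix of edge counts between three vertex types.
    E = [[50, 5, 2],
         [10, 30, 8],
         [3, 4, 20]]
    print("Q:", coefficient_Q(E), "Q (transposed):", coefficient_Q(transpose(E)))
    print("r:", coefficient_r(E), "r (transposed):", coefficient_r(transpose(E)))
```

On an asymmetric matrix such as this one, the two values of $Q$ differ while $r$ is unchanged by transposition, which is the first of the two shortcomings discussed in the text. 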

Another type of assortative mixing is mixing by scalar 
characteristics such as age or income. Again it is usually 
found that people prefer to associate with others of simi- 
lar age and income to themselves, although of course age 
and income, like race, may be proxies for other driving 
forces, such as cultural differences. Garfinkel et al. [170] 
and Newman [318], for example, have analyzed data for 
unmarried and married couples respectively to show that 
there is strong correlation between the ages of partners. 
Mixing by scalar characteristics can be quantified by cal- 
culating a correlation coefficient for the characteristic in 
question. 

In theory assortative mixing according to vector char- 
acteristics should also be possible. For example, geo- 
graphic location probably affects individuals' propensity 
to become acquainted. Location could be viewed as a 
two-vector, with the probability of connection between 
pairs of individuals being assortative on the values of 
these vectors. 



F. Degree correlations 

A special case of assortative mixing according to a 
scalar vertex property is mixing according to vertex de- 
gree, also commonly referred to simply as degree corre- 
lation. Do the high-degree vertices in a network asso- 
ciate preferentially with other high-degree vertices? Or 
do they prefer to attach to low-degree ones? Both situ- 
ations are seen in some networks, as it turns out. The 
case of assortative mixing by degree is of particular in- 
terest because, since degree is itself a property of the 
graph topology, degree correlations can give rise to some 
interesting network structure effects. 

Several different ways of quantifying degree correlations 
have been proposed. Maslov et al. [275] have 
simply plotted the two-dimensional histogram of the 
degrees of vertices at either end of an edge. They have 
shown results for protein interaction networks and the 
Internet. A more compact representation of the situation 
is that proposed by Pastor-Satorras et al. [331, 401], 
who in studies of the Internet calculated the mean 
degree of the network neighbors of a vertex as a function of 
the degree $k$ of that vertex. This gives a one-parameter 
curve which increases with $k$ if the network is assortatively 
mixed. For the Internet in fact it is found to decrease 
with $k$, a situation we call disassortativity. 
Newman [314, 318] reduced the measurement still further to 
a single number by calculating the Pearson correlation 
coefficient of the degrees at either end of an edge. This 
gives a single number that should be positive for 
assortatively mixed networks and negative for disassortative 
ones. In Table II we show results for a number of different 
networks. An interesting observation is that essentially 
all social networks measured appear to be assortative, 
but other types of networks (information networks, tech- 
nological networks, biological networks) appear to be dis- 
assortative. It is not clear what the explanation for this 
result is, or even if there is any one single explanation. 
(Probably there is not.) 
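
Two of the measures just described can be sketched in a few lines for an undirected network given as an edge list. This is an illustrative implementation under simple assumptions (no self-edges or multiedges; function names are our own): the mean degree of the neighbors of vertices of degree $k$, and the Pearson correlation coefficient of the degrees at the two ends of an edge, with each edge counted once in each direction. 

```python
from collections import defaultdict


def degrees(edges):
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def mean_neighbor_degree(edges):
    """Map k -> mean degree of the neighbors of vertices having degree k."""
    deg = degrees(edges)
    neighbor_sum = defaultdict(int)
    for u, v in edges:
        neighbor_sum[u] += deg[v]
        neighbor_sum[v] += deg[u]
    total, count = defaultdict(float), defaultdict(int)
    for v, k in deg.items():
        total[k] += neighbor_sum[v] / k
        count[k] += 1
    return {k: total[k] / count[k] for k in sorted(total)}


def degree_correlation(edges):
    """Pearson correlation of the degrees at the two ends of an edge."""
    deg = degrees(edges)
    xs, ys = [], []
    for u, v in edges:            # count every edge in both directions
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    n = len(xs)
    mean = sum(xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("all vertices have the same degree; r is undefined")
    return cov / var


if __name__ == "__main__":
    star = [(0, i) for i in range(1, 5)]      # one hub joined to four leaves
    print(mean_neighbor_degree(star))
    print(degree_correlation(star))
```

A star graph is the extreme disassortative case, since every edge joins the highest-degree vertex to a vertex of degree one. 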



G. Community structure 

It is widely assumed [363, 409] that most social networks 
show "community structure," i.e., groups of vertices 
that have a high density of edges within them, with 
a lower density of edges between groups. It is a matter 
of common experience that people do divide into groups 
along lines of interest, occupation, age, and so forth, and 
the phenomenon of assortativity discussed in Sec. III.E 
certainly suggests that this might be the case. (It is possible 
for a network to have assortative mixing but no 
community structure. This can occur, for example, when 
there is assortative mixing by age or other scalar quantities. 
Networks with this type of structure are sometimes 
said to be "stratified.") 

In Fig. 8 we show a visualization of the friendship network 
of children in a US school taken from a study by 
Moody [291] (see footnote 15). The figure was created using a "spring 
embedding" algorithm, in which linear springs are placed 
between vertices and the system is relaxed using a first- 
order energy minimization. We have no special reason 
to suppose that this very simple algorithm would reveal 
anything particularly useful about the network, but the 
network appears to have strong enough community struc- 
ture that in fact the communities appear clearly in the 
figure. Moreover, when Moody colors the vertices ac- 
cording to the race of the individuals they represent, as 
shown in the figure, it becomes immediately clear that 
one of the principal divisions in the network is by indi- 
viduals' race, and this is presumably what is driving the 
formation of communities in this case. (The other princi- 
pal division visible in the figure is between middle school 
and high school, which are age divisions in the American 
education system.) 



15 This image does not appear in the paper cited, but it and a 
number of other images from the same study can be found on 
the Web at http://www.sociology.ohio-state.edu/jwm/. 
FIG. 8 Friendship network of children in a US school. Friendships are determined by asking the participants, and hence are 
directed, since A may say that B is their friend but not vice versa. Vertices are color coded according to race, as marked, and 
the split from left to right in the figure is clearly primarily along lines of race. The split from top to bottom is between middle 
school and high school, i.e., between younger and older children. Picture courtesy of James Moody. 



It would be of some interest, and indeed practical im- 
portance, were we to find that other types of networks, 
such as those listed in Table II, show similar group 
structure also. One might well imagine for example 
that citation networks would divide into groups representing 
particular areas of research interest, and a good 
deal of energy has been invested in studies of this 
phenomenon [101, 138]. Similarly communities in the World 
Wide Web might reflect the subject matter of pages, com- 
munities in metabolic, neural, or software networks might 
reflect functional units, communities in food webs might 
reflect subsystems within ecosystems, and so on. 

The traditional method for extracting community 
structure from a network is cluster analysis [147], sometimes 
also called hierarchical clustering (see footnote 16). In this 
method, one assigns a "connection strength" to vertex 
pairs in the network of interest. In general each of the 
$\frac{1}{2} n(n-1)$ possible pairs in a network of $n$ vertices is 
assigned such a strength, not just those that are connected 
by an edge, although there are versions of the 
method where not all pairs are assigned a strength; in 
that case one can assume the remaining pairs to have a 
connection strength of zero. Then, starting with $n$ vertices 
with no edges between any of them, one adds edges 
in order of decreasing vertex-vertex connection strength. 
One can pause at any point in this process and examine 
the component structure formed by the edges added so 
far; these components are taken to be the communities 
(or "clusters") at that stage in the process. When all 
edges have been added, all vertices are connected to all 
others, and there is only one community. The entire process 
can be represented by a tree or dendrogram of union 
operations between vertex sets in which the communities 
at any level correspond to a horizontal cut through the 
tree; see Fig. 9 (and footnote 17). 
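
The procedure just described can be sketched with a union-find structure, which keeps track of the components as edges are added in order of decreasing strength. This is a minimal illustration under stated assumptions: vertices are labeled 0 to n-1, the connection strengths are supplied as a dictionary of hypothetical values, pairs absent from the dictionary are treated as having zero strength and are never joined, and the function names are our own. 

```python
def _make_find(parent):
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v
    return find


def by_decreasing_strength(strengths):
    return sorted(strengths.items(), key=lambda item: item[1], reverse=True)


def dendrogram(n, strengths):
    """List of union operations (strength, community, community), in the
    order in which they occur as edges are added."""
    parent = list(range(n))
    find = _make_find(parent)
    members = {v: [v] for v in range(n)}
    merges = []
    for (u, v), s in by_decreasing_strength(strengths):
        ru, rv = find(u), find(v)
        if ru != rv:                        # this edge joins two communities
            merges.append((s, sorted(members[ru]), sorted(members[rv])))
            parent[rv] = ru
            members[ru] += members.pop(rv)
    return merges


def communities(n, strengths, threshold):
    """Components formed by all edges of strength >= threshold, i.e. one
    horizontal cut through the dendrogram."""
    parent = list(range(n))
    find = _make_find(parent)
    for (u, v), s in by_decreasing_strength(strengths):
        if s < threshold:
            break
        parent[find(v)] = find(u)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical connection strengths for five vertices.
    strengths = {(0, 1): 0.9, (1, 2): 0.8, (3, 4): 0.7, (2, 3): 0.1}
    for merge in dendrogram(5, strengths):
        print(merge)
    print(communities(5, strengths, threshold=0.5))
```

Because communities are identified with connected components, this corresponds to what is usually called single-linkage clustering; other linkage rules would need a different merge criterion. 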

Clustering is possible according to many different defi- 
nitions of the connection strength. Reasonable choices in- 
clude various weighted vertex-vertex distance measures, 
the sizes of minimum cut-sets (i.e., maximum flow), 



16 Not to be confused with the entirely different use of the word 
clustering introduced in Sec. III.B. 



17 For some reason such trees are conventionally depicted with their 
"root" at the top and their "leaves" at the bottom, which is not 
the natural order of things for most trees. 
FIG. 9 An example of a dendrogram showing the hierarchical 
clustering of ten vertices. A horizontal cut through the den- 
drogram, such as that denoted by the dotted line, splits the 
vertices into a set of communities, five in this case. 



and weighted path counts between vertices. Recently a 
number of authors have had success with methods based 
on "edge betweenness," which is the count of how many 
geodesic paths between vertices run along each edge in 
the network [171, 185, 197, 422]. Results appear to show 
that, for social and biological networks at least, commu- 
nity structure is a common network property, although 
some food webs are found not to break up into commu- 
nities in any simple way. (Food webs may be different 
from other networks in that they appear to be dense: 
mean vertex degree increases roughly linearly with network 
size, rather than remaining constant as it does in 
most networks [132, 273]. The same may be true of 
metabolic networks also [P. Holme, personal communi- 
cation].) 

Network clustering should not be confused with the 
technique of data clustering, which is a way of detecting 
groupings of data-points in high-dimensional data 
spaces [208]. The two problems do have some common 
features however, and algorithms for one can be 
adapted for the other, and vice versa. For example, high- 
dimensional data can be converted into a network by 
placing edges between closely spaced data points, and 
then network clustering algorithms can be applied to the 
result. On balance, however, one normally finds that al- 
gorithms specially devised for data clustering work better 
than such borrowed methods, and the same is true in re- 
verse. 

In the social networks literature, network clustering 
has been discussed to a great extent in the context of 
so-called block models [71, 419], which are essentially just 
divisions of networks into communities or blocks accord- 
ing to one criterion or another. Sociologists have concen- 
trated particularly on structural equivalence. Two ver- 
tices in a network are said to be structurally equivalent 
if they have all of the same neighbors. Exact structural 
equivalence is rare, but approximate equivalence can be 
used as the basis for a hierarchical clustering method such 
as that described above. 
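
As an illustration of approximate structural equivalence, one can score each pair of vertices by the overlap between their sets of neighbors. The choice of the Jaccard index below is our assumption, made for simplicity; it is one of several measures that could serve, and the function names are illustrative. A score of 1 corresponds to exact structural equivalence, and the resulting scores could be used as connection strengths in a hierarchical clustering. 

```python
from itertools import combinations


def neighbor_overlap(adj, u, v):
    """Jaccard index of the neighbor sets of u and v, ignoring u and v
    themselves so that an edge between them does not affect the score."""
    nu = adj[u] - {v}
    nv = adj[v] - {u}
    union = nu | nv
    if not union:
        return 1.0          # neither has other neighbors: trivially equivalent
    return len(nu & nv) / len(union)


def equivalence_strengths(adj):
    """Overlap score for every pair of vertices."""
    return {(u, v): neighbor_overlap(adj, u, v)
            for u, v in combinations(sorted(adj), 2)}


if __name__ == "__main__":
    # Small hypothetical network: vertices 0 and 1 have identical neighbors.
    adj = {0: {2, 3}, 1: {2, 3}, 2: {0, 1}, 3: {0, 1, 4}, 4: {3}}
    for pair, score in sorted(equivalence_strengths(adj).items()):
        print(pair, round(score, 3))
```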

Another slightly different question about community 
structure, but related to the one discussed here, has been 
studied by Flake et al. [158]: if one is given an example 
vertex drawn from a known network, can one identify the 
community to which it belongs? Algorithmic methods for 
answering this question would clearly be of some practical 



value for searching networks such as the World Wide Web 
and citation networks. Flake et al. give what appears to 
be a very successful algorithm, at least in the context of 
the Web, based on a maximum flow method. 



H. Network navigation 

Stanley Milgram's famous small-world experiment 
(Sec. II.A), in which letters were passed from person to 
person in an attempt to get them to a desired target 
individual, showed that there exist short paths through 
social networks between apparently distant individuals. 
However, there is another conclusion that can be drawn 
from this experiment which Milgram apparently failed to 
notice; it was pointed out in 2000 by Kleinberg [238, 239]. 
Milgram's results demonstrate that there exist short 
paths in the network, but they also demonstrate that 
ordinary people are good at finding them. This is, upon 
reflection, perhaps an even more surprising result than 
the existence of the paths in the first place. The partic- 
ipants in Milgram's study had no special knowledge of 
the network connecting them to the target person. Most 
people know only who their friends are and perhaps a few 
of their friends' friends. Nonetheless it proved possible 
to get a message to a distant target in only a small num- 
ber of steps. This indicates that there is something quite 
special about the structure of the network. On a random 
graph for instance, as Kleinberg pointed out, short paths 
between vertices exist but no one would be able to find 
them given only the kind of information that people have 
in realistic situations. If it were possible to construct arti- 
ficial networks that were easy to navigate in the same way 
that social networks appear to be, it has been suggested 
they could be used to build efficient database structures 
or better peer-to-peer computer networks [5, 415] (see 
Sec. VIII.C.3). 



I. Other network properties 

In addition to the heavily studied network properties 
of the preceding sections, a number of others have re- 
ceived some attention. In some networks the size of the 
largest component is an important quantity. For exam- 
ple, in a communication network like the Internet the size 
of the largest component represents the largest fraction 
of the network within which communication is possible 
and hence is a measure of the effectiveness of the network 
at doing its job [323]. The size of the 
largest component is often equated with the graph theoretical 
concept of the "giant component" (see Sec. IV.A), 
although technically the two are only the same in the 
limit of large graph size. The size of the second-largest 
component in a network is also measured sometimes. In 
networks well above the density at which a giant component 
first forms, the largest component is expected to be 
much larger than the second largest (Sec. IV.A). 
Goh et al. [175] have made a statistical study of the 
distribution of the "betweenness centrality" of vertices in 
networks. The betweenness centrality of a vertex $i$ is the 
number of geodesic paths between other vertices that run 
through $i$ [161, 363, 409]. Goh et al. show that betweenness 
appears to follow a power law for many networks 
and propose a classification of networks into two kinds 
based on the exponent of this power law. Betweenness 
centrality can also be viewed as a measure of network 
resilience [200, 312]; it tells us how many geodesic paths 
will get longer when a vertex is removed from the network. 
Latora and Marchiori [260, 261] have considered 
the harmonic mean distance between a vertex and all others, 
which they call the "efficiency" of the vertex. This, 
like betweenness centrality, can be viewed as a measure 
of network resilience, indicating how much effect on path 
length the removal of a vertex will have. A number of 
authors have looked at the eigenvalue spectra and eigenvectors 
of the graph Laplacian (or equivalently the adjacency 
matrix) of a network [146, 151], which tells us 
about diffusion or vibration modes of the network, and 
about vertex centrality [66, 67] (see also the discussion 
of network search strategies in Sec. VIII.C.1). 
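
Betweenness centrality can be computed for all vertices of an unweighted, undirected network with one breadth-first search per vertex. The sketch below follows the well-known accumulation scheme due to Brandes, which is a technique we bring in for illustration and is not described in the text. It adopts the common convention, an assumption here, that when several geodesic paths join a pair of vertices each path contributes an equal fraction, so that every pair contributes a total of one. 

```python
from collections import deque


def betweenness(adj):
    """Betweenness of every vertex; adj maps each vertex to a set of neighbors."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        paths = {v: 0 for v in adj}      # number of geodesics from s to v
        paths[s] = 1
        preds = {v: [] for v in adj}     # predecessors on geodesics from s
        order = []                       # vertices in order of distance from s
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest vertices back toward s.
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != s:
                score[w] += dependency[w]
    # Each unordered pair was counted once from each of its two ends.
    return {v: b / 2 for v, b in score.items()}


if __name__ == "__main__":
    chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    print(betweenness(chain))
```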

Milo et al. [284, 368] have presented a novel analysis 
that picks out recurrent motifs — small subgraphs — from 
complete networks. They apply their method to genetic 
regulatory networks, food webs, neural networks and the 
World Wide Web, finding different motifs in each case. 
They have also made suggestions about the possible func- 
tion of these motifs within the networks. In regulatory 
networks, for instance, they identify common subgraphs 
with particular switching functions in the system, such 
as gates and other feed-forward logical operations. 



IV. RANDOM GRAPHS 

The remainder of this review is devoted to our pri- 
mary topic of study, the mathematics of model networks 
of various kinds. Recent work has focused on models 
of four general types, which we treat in the four following 
sections. In this section we look at random graph models, 
starting with the classic Poisson random graph of 
Rapoport [346, 378] and Erdos and Renyi [141, 142], 
and concentrating particularly on the generalized random 
graphs studied by Molloy and Reed [287, 288] and 
others. In Sec. V we look at the somewhat neglected but 
potentially very useful Markov graphs and their more 
general forms, exponential random graphs and p* models. 
In Section VI we look at the "small-world model" of 
Watts and Strogatz [416] and its generalizations. Then 
in Section VII we look at models of growing networks, 
particularly the models of Price [344] and Barabasi and 
Albert [32], and generalizations. Finally, in Section VIII 
we look at a number of models of processes occurring on 
networks, such as search and navigation processes, and 
network transmission and epidemiology. 

The first serious attempt at constructing a model for 



large and (apparently) random networks was the "random 
net" of Rapoport and collaborators [346, 378], which 
was independently rediscovered a decade later by Erdos 
and Renyi [141], who studied it exhaustively and rigorously, 
and who gave it the name "random graph" by 
which it is most often known today. Where necessary, we 
will here refer to it as the "Poisson random graph," to 
avoid confusion with other random graph models. It is 
also sometimes called the "Bernoulli graph." As we will 
see in this section, the random graph, while illuminating, 
is inadequate to describe some important properties of 
real-world networks, and so has been extended in a va- 
riety of ways. In particular, the random graph's Poisson 
degree distribution is quite unlike the highly skewed distributions 
of Section III.C and Fig. 6. Extensions of the 
model to allow for other degree distributions lead to the 
class of models known as "generalized random graphs," 
"random graphs with arbitrary degree distributions" and 
the "configuration model." 

We here look first at the Poisson random graph, and 
then at its generalizations. Our treatment of the Poisson 
case is brief. A much more thorough treatment can be 
found in the books by Bollobas [62] and Janson et al. [211] 
and the review by Karonski [223]. 



A. Poisson random graphs 

Solomonoff and Rapoport [378] and independently 
Erdos and Renyi [141] proposed the following extremely 
simple model of a network. Take some number $n$ of vertices 
and connect each pair (or not) with probability $p$ 
(or $1 - p$) (see footnote 18). This defines the model that Erdos and Renyi 
called $G_{n,p}$. In fact, technically, $G_{n,p}$ is the ensemble of 
all such graphs in which a graph having $m$ edges appears 
with probability $p^m (1 - p)^{M-m}$, where $M = \frac{1}{2} n(n-1)$ 
is the maximum possible number of edges. Erdos and 
Renyi also defined another, related model, which they 
called $G_{n,m}$, which is the ensemble of all graphs having 
$n$ vertices and exactly $m$ edges, each possible graph 
appearing with equal probability (see footnote 19). Here we will discuss 
$G_{n,p}$, but most of the results carry over to $G_{n,m}$ in 
a straightforward fashion. 

Many properties of the random graph are exactly solv- 
able in the limit of large graph size, as was shown by 



18 Slight variations on the model are possible depending on 
whether one allows self-edges or not (i.e., edges that connect a 
vertex to itself), but this distinction makes a negligible difference 
to the average behavior of the model in the limit of large $n$. 

19 Those familiar with statistical mechanics will notice a similarity 
between these two models and the so-called canonical and 
grand canonical ensembles. In fact, the analogy is exact, and one 
can define equivalents of the Helmholtz and Gibbs free energies, 
which are generating functions for moments of graph properties 
over the distribution of graphs and which are related by a 
Lagrange transform with respect to the "field" $p$ and the "order 
parameter" $m$. 
Erdos and Renyi in a series of papers in the 1960s [141]. 
Typically the limit of large $n$ is taken holding 
the mean degree $z = p(n-1)$ constant, in which case the 
model clearly has a Poisson degree distribution, since the 
presence or absence of edges is independent, and hence 
the probability of a vertex having degree $k$ is 

$$p_k = \binom{n}{k} p^k (1 - p)^{n-k} \simeq \frac{z^k e^{-z}}{k!}, \tag{18}$$



with the last approximate equality becoming exact in the 
limit of large $n$ and fixed $k$. This is the reason for the 
name "Poisson random graph." 
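The convergence in Eq. (18) is easy to check numerically. The following Python sketch is an illustration only and is not part of the original analysis; the function names and the choice $z = 3$ are assumptions made for the example. It compares the binomial form with its Poisson limit at fixed mean degree as $n$ grows.

```python
import math

def binomial_degree_pmf(n, p, k):
    """Probability of degree k using the binomial form on the left of Eq. (18)."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def poisson_pmf(z, k):
    """Poisson limit on the right of Eq. (18)."""
    return z**k * math.exp(-z) / math.factorial(k)

def max_abs_gap(n, z, k_max=20):
    """Largest pointwise difference between the two forms for k = 0..k_max."""
    p = z / (n - 1)  # hold the mean degree z = p(n-1) fixed
    return max(abs(binomial_degree_pmf(n, p, k) - poisson_pmf(z, k))
               for k in range(k_max + 1))

if __name__ == "__main__":
    for n in (10**2, 10**3, 10**4):
        print(n, max_abs_gap(n, z=3.0))
```

The printed gap should shrink as $n$ increases, which is the sense in which the approximate equality becomes exact.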

The expected structure of the random graph varies 
with the value of p. The edges join vertices together 
to form components, i.e., (maximal) subsets of vertices 
that are connected by paths through the network. Both 
Solomonoff and Rapoport and also Erdos and Renyi 
demonstrated what is for our purposes the most impor- 
tant property of the random graph, that it possesses what 
we would now call a phase transition, from a low-density, 
low-p state in which there are few edges and all compo- 
nents are small, having an exponential size distribution 
and finite mean size, to a high-density, high-p state in 
which an extensive (i.e., O(n)) fraction of all vertices are 
joined together in a single giant component, the remain- 
der of the vertices occupying smaller components with 
again an exponential size distribution and finite mean 
size. 
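The transition described above can be seen in a direct simulation. The sketch below is a minimal illustration under our own assumptions (pure Python, a union-find structure to track components, an $O(n^2)$ loop over vertex pairs, a fixed random seed); none of these implementation choices come from the text.

```python
import random

def largest_component_fraction(n, z, seed=0):
    """Sample G(n, p) with p = z/(n-1); return the largest component's share of vertices."""
    rng = random.Random(seed)
    p = z / (n - 1)
    parent = list(range(n))
    size = [1] * n

    def find(v):
        # Find the component root, shortening the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:          # each pair is joined independently
                a, b = find(i), find(j)
                if a != b:
                    if size[a] < size[b]:
                        a, b = b, a
                    parent[b] = a         # merge the smaller tree into the larger
                    size[a] += size[b]
    return max(size[find(v)] for v in range(n)) / n

if __name__ == "__main__":
    for z in (0.5, 1.0, 1.5, 2.0):
        print(z, largest_component_fraction(1000, z))
```

For mean degree well below 1 the largest component should hold only a small fraction of the vertices, while well above 1 it should hold a finite fraction.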

We can calculate the expected size of the giant compo- 
nent from the following simple heuristic argument. Let 
u be the fraction of vertices on the graph that do not 
belong to the giant component, which is also the proba- 
bility that a vertex chosen uniformly at random from the 
graph is not in the giant component. The probability 
of a vertex not belonging to the giant component is also 
equal to the probability that none of the vertex's network 
neighbors belong to the giant component, which is just
$u^k$ if the vertex has degree $k$. Averaging this expression
over the probability distribution of $k$, Eq. (18), we then
find the following self-consistency relation for $u$ in the
limit of large graph size:



$$u = \sum_{k=0}^{\infty} p_k u^k = e^{-z} \sum_{k=0}^{\infty} \frac{(zu)^k}{k!} = e^{z(u-1)}. \tag{19}$$



The fraction $S$ of the graph occupied by the giant component
is $S = 1 - u$ and hence

$$S = 1 - e^{-zS}. \tag{20}$$



By an argument only slightly more complex, which we
give in the following section, we can show that the mean
size $\langle s \rangle$ of the component to which a randomly chosen
vertex belongs (for non-giant components) is

$$\langle s \rangle = \frac{1}{1 - z + zS}. \tag{21}$$



FIG. 10 The mean component size (solid line), excluding the
giant component if there is one, and the giant component
size (dotted line), for the Poisson random graph, Eqs. (20)
and (21).

The form of these two quantities is shown in Fig. 10.
Equation (20) is transcendental and has no closed-form
solution, but it is easy to see that for $z < 1$ its only
non-negative solution is $S = 0$, while for $z > 1$ there is also
a non-zero solution, which is the size of the giant component.
The phase transition occurs at $z = 1$. This is
also the point at which $\langle s \rangle$ diverges, a behavior that will
be recognized by those familiar with the theory of phase
transitions: $S$ plays the role of the order parameter in
this transition and $\langle s \rangle$ the role of the order-parameter
fluctuations. The corresponding critical exponents, defined
by $S \sim (z-1)^{\beta}$ and $\langle s \rangle \sim |z-1|^{-\gamma}$, take the values
$\beta = 1$ and $\gamma = 1$. Precisely at the transition, $z = 1$, there
is a "double jump": the mean size of the largest component
in the graph goes as $O(n^{2/3})$ for $z = 1$, rather than
$O(n)$ as it does above the transition. The components
at the transition have a power-law size distribution with
exponent $\tau = \frac{5}{2}$ (or $\frac{3}{2}$ if one asks about the component
to which a randomly chosen vertex belongs). We look at
these results in more detail in the next section for the
more general "configuration model."
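Although Eq. (20) has no closed-form solution, it can be solved by simple fixed-point iteration. The following Python sketch is our own illustration (the function names, starting point, and tolerance are assumptions, not taken from the text); it evaluates $S$ from Eq. (20) and then the mean size of the non-giant components from Eq. (21).

```python
import math

def giant_component_size(z, tol=1e-12, max_iter=100000):
    """Iterate S <- 1 - exp(-z S), Eq. (20), starting from S = 1."""
    s = 1.0
    for _ in range(max_iter):
        s_new = 1.0 - math.exp(-z * s)
        if abs(s_new - s) < tol:
            return s_new
        s = s_new
    return s  # convergence is slow very close to z = 1

def mean_small_component_size(z):
    """Mean size of non-giant components, Eq. (21)."""
    s = giant_component_size(z)
    return 1.0 / (1.0 - z + z * s)

if __name__ == "__main__":
    for z in (0.5, 0.9, 1.5, 2.0, 3.0):
        print(z, giant_component_size(z), mean_small_component_size(z))
```

Starting the iteration from $S = 1$ is a deliberate choice: for $z > 1$ it converges to the non-zero solution rather than to the trivial solution $S = 0$.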

The random graph reproduces well one of the principal
features of real-world networks discussed in Section III,
namely the small-world effect. The mean number of
neighbors a distance $\ell$ away from a vertex in a
random graph is $z^{\ell}$, and hence the value of $\ell$ needed to
encompass the entire network is given by $z^{\ell} \simeq n$. Thus a typical
distance through the network is $\ell = \log n / \log z$, which
satisfies the definition of the small-world effect given in
Sec. III.A. Rigorous results to this effect can be found
in, for instance, Refs. [?] and [?]. However, in almost all
other respects, the properties of the random graph do not
match those of networks in the real world. It has a low
clustering coefficient: the probability of connection of two
vertices is $p$ regardless of whether they have a common
neighbor, and hence $C = p$, which tends to zero as $n^{-1}$ in
the limit of large system size [416]. The model also has a
Poisson degree distribution, quite unlike the distributions
in Fig. 5. It has entirely random mixing patterns, no
correlation between degrees of adjacent vertices, no community
structure, and navigation is impossible on a random
graph using local algorithms [239, ?].
In short it makes a good straw man but is rarely taken
seriously in the modeling of real systems.
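The estimate $\ell = \log n / \log z$ can be compared with distances measured on a sampled graph. The sketch below is an illustration under our own assumptions (a single sample with $n = 1000$ and $z = 6$, breadth-first search from 50 randomly chosen sources, distances counted only to reachable vertices); it is not the rigorous treatment referred to in the text.

```python
import math
import random
from collections import deque

def sample_gnp(n, z, rng):
    """Adjacency lists of one G(n, p) sample with p = z/(n-1)."""
    p = z / (n - 1)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def mean_distance(adj, sources):
    """Average shortest-path length from the given sources to all vertices they can reach."""
    total, pairs = 0, 0
    for s in sources:
        dist = {s: 0}
        queue = deque([s])
        while queue:                      # standard breadth-first search
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def compare(n=1000, z=6.0, seed=1):
    """Return (measured mean distance, log n / log z) for one sampled graph."""
    rng = random.Random(seed)
    adj = sample_gnp(n, z, rng)
    sources = rng.sample(range(n), 50)
    return mean_distance(adj, sources), math.log(n) / math.log(z)

if __name__ == "__main__":
    print(compare())
```

The two numbers should agree to within a modest factor; the simple estimate ignores corrections of order one.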

Nonetheless, much of our basic intuition about the way 
networks behave comes from the study of the random 
graph. In particular, the presence of the phase transi- 
tion and the existence of a giant component are ideas 
that underlie much of the work described in this review. 
One often talks about the giant component of a network, 
meaning in fact the largest component; one looks at the 
sizes of smaller components, often finding them to be 
much smaller than the largest component; one sees a gi- 
ant component transition in many of the more sophisti- 
cated models that we will look at in the coming sections. 
All of these are ideas that started with the Poisson ran- 
dom graph. 



B. Generalized random graphs 

Random graphs can be extended in a variety of ways to
make them more realistic. The property of real graphs
that is simplest to incorporate is the property of
non-Poisson degree distributions, which leads us to the
so-called "configuration model." Here we examine this
model in detail; in Secs. IV.B.3 to IV.B.5 we describe
further generalizations of the random graph to add other
features.



1. The configuration model 

Consider the model defined in the following way. We
specify a degree distribution $p_k$, such that $p_k$ is the
fraction of vertices in the network having degree $k$. We
choose a degree sequence, which is a set of $n$ values of
the degrees $k_i$ of vertices $i = 1 \ldots n$, from this
distribution. We can think of this as giving each vertex $i$ in our
graph $k_i$ "stubs" or "spokes" sticking out of it, which are
the ends of edges-to-be. Then we choose pairs of stubs
at random from the network and connect them together.
It is straightforward to demonstrate [287] that this
process generates every possible topology of a graph with
the given degree sequence with equal probability.$^{20}$ The
configuration model is defined as the ensemble of graphs
so produced, with each having equal weight.$^{21}$
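The stub-matching construction can be sketched in a few lines of Python. This is an illustration only: the function names are ours, and shuffling the list of stubs and pairing consecutive entries is one convenient way (assumed here) of choosing pairs of stubs uniformly at random. As in the model itself, self-edges and multiple edges are allowed.

```python
import random
from collections import Counter

def configuration_model(degrees, seed=None):
    """Pair stubs uniformly at random; returns a list of edges (u, v)."""
    if sum(degrees) % 2:
        raise ValueError("the degree sequence must have an even sum")
    rng = random.Random(seed)
    # Vertex v contributes degrees[v] stubs.
    stubs = [v for v, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    # Consecutive entries of the shuffled list form a uniformly random pairing.
    return list(zip(stubs[0::2], stubs[1::2]))

def realized_degrees(n, edges):
    """Degree of each vertex in the generated multigraph."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1  # a self-edge (a == b) contributes 2 to that vertex
    return [deg[v] for v in range(n)]

if __name__ == "__main__":
    degrees = [3, 2, 2, 1, 1, 1]
    edges = configuration_model(degrees, seed=42)
    print(edges, realized_degrees(len(degrees), edges))
```

By construction the realized degree sequence equals the requested one, which is the defining property of the ensemble.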



Since the 1970s the configuration model has been studied
by a number of authors [?, 288, 323, 425]. An exact
condition is known in terms of $p_k$ for the model to possess
a giant component [287], the expected size of that
component is known [288], and the average size of non-giant
components both above and below the transition is
known [323], along with a variety of other properties, such
as mean numbers of vertices a given distance away from a
central vertex and typical vertex-vertex distances [88].
Here we give a brief derivation of the main results using
the generating function formalism of Newman et al. [323].
More rigorous treatments of the same results can be found
in Refs. [?].

There are two important points to grasp about the
configuration model. First, $p_k$ is, in the limit of large
graph size, the distribution of degrees of vertices in our
graph, but the degree of the vertex we reach by following
a randomly chosen edge on the graph is not given by $p_k$.
Since there are $k$ edges that arrive at a vertex of degree $k$,
we are $k$ times as likely to arrive at that vertex as we
are at some other vertex that has degree 1. Thus the
degree distribution of the vertex at the end of a randomly
chosen edge is proportional to $k p_k$. In most cases, we are
interested in how many edges there are leaving such a
vertex other than the one we arrived along, i.e., in the
so-called excess degree, which is one less than the total
degree of the vertex. In the configuration model, the
excess degree has a distribution $q_k$ given by



$$q_k = \frac{(k+1) p_{k+1}}{\sum_k k p_k} = \frac{(k+1) p_{k+1}}{z}, \tag{22}$$

where $z = \sum_k k p_k$ is, as before, the mean degree in the
network.
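As a small worked illustration of Eq. (22), the Python sketch below computes the excess degree distribution from a tabulated degree distribution. The function name and the example distribution (half the vertices with degree 1, half with degree 3) are our own choices for the example.

```python
def excess_degree_distribution(p):
    """Given p[k] for k = 0..K, return (q, z) with q[k] = (k+1) p[k+1] / z, Eq. (22)."""
    z = sum(k * pk for k, pk in enumerate(p))               # mean degree
    q = [(k + 1) * p[k + 1] / z for k in range(len(p) - 1)]
    return q, z

if __name__ == "__main__":
    # Half the vertices have degree 1 and half have degree 3.
    q, z = excess_degree_distribution([0.0, 0.5, 0.0, 0.5])
    print(z, q)
```

In this example three quarters of all edge ends attach to degree-3 vertices, so an edge followed at random leads to a vertex of excess degree 2 with probability 0.75, even though only half the vertices have degree 3.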

The second important point about the model is that
the chance of finding a loop in a small component of the
graph goes as $n^{-1}$. The fraction of the graph's vertices that
lie in any one non-giant component is $O(n^{-1})$, and hence the
probability of there being more than one path between any
pair of vertices is also $O(n^{-1})$ for suitably well-behaved
degree distributions.$^{22}$ This property is crucial to the solution
of the configuration model, but is definitely not true of most
real-world networks (see Sec. III.B). It is an open question
how much the predictions of the model would change
if we were able to incorporate the true loop structure of
real networks into it.

20 Each possible graph can be generated in $\prod_i k_i!$ different ways, since
the stubs around each vertex are indistinguishable. This factor
is a constant for a given degree sequence and hence each graph
appears with equal probability.

21 An alternative model has recently been proposed by Chung and
Lu [88, 89]. In their model, each vertex $i$ is assigned a desired
degree $k_i$ chosen from the distribution of interest, and then
$m = \frac{1}{2} \sum_i k_i$ edges are placed between vertex pairs $(i, j)$ with
probability proportional to $k_i k_j$. This model has the disadvantage
that the final degree sequence is not in general precisely
equal to the desired degree sequence, but it has some significant
calculational advantages that make the derivation of rigorous
results easier. It is also a logical generalization of the Poisson
random graph, in a way that the configuration model is not.
Similar approaches have also been taken by a number of other
authors [?].

22 Using arguments similar to those leading to Eq. (31), we can
show that the density of loops in small components will tend to
zero as graph size becomes large provided that $z$ is finite and
$\langle k^2 \rangle$ grows slower than $n^{1/2}$. See also footnote 25.

We now proceed by defining two generating functions
for the distributions $p_k$ and $q_k$:$^{23}$

$$G_0(x) = \sum_{k=0}^{\infty} p_k x^k, \qquad G_1(x) = \sum_{k=0}^{\infty} q_k x^k. \tag{23}$$



Note that, using Eq. (22), we also find that
$G_1(x) = G_0'(x)/z$, which is occasionally convenient. Then the
generating function $H_1(x)$ for the total number of vertices
reachable by following an edge satisfies the
self-consistency condition

$$H_1(x) = x G_1(H_1(x)). \tag{24}$$



This equation says that when we follow an edge, we find
at least one vertex at the other end (the factor of $x$ on
the right-hand side), plus some other clusters of vertices
(each represented by $H_1$) which are reachable by following
other edges attached to that one vertex. The number
of these other clusters is distributed according to $q_k$,
hence the appearance of $G_1$. A detailed derivation of
Eq. (24) is given in Ref. [323].

The total number of vertices reachable from a randomly
chosen vertex, i.e., the size of the component to
which such a vertex belongs, is generated by $H_0(x)$ where

$$H_0(x) = x G_0(H_1(x)). \tag{25}$$



The solution of Eqs. (24) and (25) gives us the entire
distribution of component sizes. Mean component size
below the phase transition in the region where there is
no giant component is given by

$$\langle s \rangle = H_0'(1) = 1 + \frac{G_0'(1)}{1 - G_1'(1)} = 1 + \frac{z_1^2}{z_1 - z_2}, \tag{26}$$



where $z_1 = z = \langle k \rangle = G_0'(1)$ is the average number of
neighbors of a vertex and $z_2 = \langle k^2 \rangle - \langle k \rangle = G_0'(1) G_1'(1)$
is the average number of second neighbors. We see that
this diverges when $z_1 = z_2$, or equivalently when

$$G_1'(1) = 1. \tag{27}$$



This point marks the phase transition at which a giant
component first appears. Substituting Eq. (23) into
Eq. (27), we can also write the condition for the phase
transition as

$$\sum_k k(k-2) p_k = 0. \tag{28}$$



Indeed, since this sum increases monotonically as edges
are added to the graph, it follows that the giant component
exists if and only if this sum is positive. A more
rigorous derivation of this result has been given by Molloy
and Reed [287].

Above the transition there is a giant component which
occupies a fraction $S$ of the graph. If we define $u$ to be
the probability that a randomly chosen edge leads to a
vertex that is not a part of this giant component, then,
by an argument precisely analogous to the one preceding
Eq. (20), this probability must satisfy the self-consistency
condition $u = G_1(u)$ and $S$ is given by the solution of

$$S = 1 - G_0(u), \qquad u = G_1(u). \tag{29}$$

An equivalent result is derived in Ref. [288]. Normally
the equation for $u$ cannot be solved in closed form, but
once the generating functions are known a solution can
be found to any desired level of accuracy by numerical
iteration. And once the value of $S$ is known, the mean size
of small components above the transition can be found
by subtracting off the giant component and applying the
arguments that led to Eq. (26) again, giving

$$\langle s \rangle = 1 + \frac{z u^2}{[1 - S][1 - G_1'(u)]}. \tag{30}$$

The result is a behavior qualitatively similar to that of
the Poisson random graph, with a continuous phase transition
at a point defined by Eq. (28), characterized by the
appearance of a giant component and the divergence of
the mean size of non-giant components. The ratio $z_2/z_1$
of the mean number of vertices two steps away to the
number one step away plays the role of the independent
parameter governing the transition, as the mean degree $z$
does in the Poisson case, and one can again define critical
exponents for the transition, which take the same values
as for the Poisson case, $\beta = \gamma = 1$, $\tau = \frac{5}{2}$.

We can also find an expression for the clustering
coefficient of the configuration model. A simple
calculation shows that [136, 319]

$$C = \frac{z}{n} \left[ \frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle^2} \right]^2 = \frac{z}{n} \left[ \frac{\langle k^2 \rangle}{\langle k \rangle^2} - \frac{1}{z} \right]^2, \tag{31}$$

which is the value $C = z/n$ for the Poisson random graph
times an extra factor that depends on $z$ and on the ratio
$\langle k^2 \rangle / \langle k \rangle^2$. Thus $C$ will normally go to zero as $n^{-1}$ for
large graphs, but for highly skewed degree distributions,
like some of those shown earlier for real networks, the factor
of $\langle k^2 \rangle / \langle k \rangle^2$ can be quite large, so that $C$ is not necessarily
negligible for the graph sizes seen in empirical studies of
networks (see below).
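The generating-function results for the configuration model lend themselves to direct numerical evaluation. The Python sketch below is our own illustration, not the authors' code: it assumes a degree distribution tabulated as a finite list, uses plain fixed-point iteration for $u = G_1(u)$ starting from $u = 0$, and then evaluates the criterion of Eq. (28), the giant component size of Eq. (29), and the mean small-component size of Eq. (30). The helper `poisson_distribution` is a hypothetical convenience used only to test the code against the Poisson results of Eqs. (20) and (21).

```python
import math

def poisson_distribution(z, k_max):
    """Truncated Poisson degree distribution p[0..k_max] (test input only)."""
    return [math.exp(-z) * z**k / math.factorial(k) for k in range(k_max + 1)]

def solve_configuration_model(p, tol=1e-12, max_iter=100000):
    """Return (criterion, S, mean_s) for a degree distribution p[k], k = 0..K."""
    z = sum(k * pk for k, pk in enumerate(p))
    G0 = lambda x: sum(pk * x**k for k, pk in enumerate(p))
    G1 = lambda x: sum(k * pk * x**(k - 1) for k, pk in enumerate(p) if k >= 1) / z
    dG1 = lambda x: sum(k * (k - 1) * pk * x**(k - 2)
                        for k, pk in enumerate(p) if k >= 2) / z
    criterion = sum(k * (k - 2) * pk for k, pk in enumerate(p))  # left side of Eq. (28)
    u = 0.0
    for _ in range(max_iter):        # fixed-point iteration u <- G1(u)
        u_new = G1(u)
        if abs(u_new - u) < tol:
            u = u_new
            break
        u = u_new
    S = 1.0 - G0(u)                  # Eq. (29)
    # Eq. (30); undefined in the degenerate case S = 1 (no small components).
    mean_s = 1.0 + z * u * u / ((1.0 - S) * (1.0 - dG1(u)))
    return criterion, S, mean_s

if __name__ == "__main__":
    print(solve_configuration_model(poisson_distribution(2.0, 60)))
```

Starting from $u = 0$ makes the iteration converge to the smallest non-negative solution of $u = G_1(u)$, which is the one relevant above the transition; below the transition it converges to $u = 1$ and Eq. (30) reduces to Eq. (26).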



23 Traditionally, the independent variable in a generating function
is denoted $z$, but here we use $x$ to avoid confusion with the mean
degree $z$.



2. Example: power-law degree distribution 

As an example of the application of these results, consider
the much studied case of a network with a power-law
degree distribution:

$$p_k = \begin{cases} 0 & \text{for } k = 0, \\ k^{-\alpha}/\zeta(\alpha) & \text{for } k \ge 1, \end{cases} \tag{32}$$

for given constant $\alpha$. Here $\zeta(\alpha)$ is the Riemann
$\zeta$-function, which functions as a normalizing constant.
Substituting into Eq. (23) we find that

$$G_0(x) = \frac{\mathrm{Li}_{\alpha}(x)}{\zeta(\alpha)}, \qquad G_1(x) = \frac{\mathrm{Li}_{\alpha-1}(x)}{x \zeta(\alpha-1)}, \tag{33}$$

where $\mathrm{Li}_n(x)$ is the $n$th polylogarithm of $x$. Then
Eq. (27) tells us that the phase transition occurs at the
point

$$\zeta(\alpha - 2) = 2 \zeta(\alpha - 1), \tag{34}$$



which gives a critical value for $\alpha$ of $\alpha_c = 3.4788\ldots$ Below
this value a giant component exists; above it there is no
giant component. For $\alpha < \alpha_c$, the value of the variable $u$
of Eq. (29) is given by

$$u = \frac{\mathrm{Li}_{\alpha-1}(u)}{u \zeta(\alpha - 1)}, \tag{35}$$

which gives $u = 0$ below $\alpha = 2$ and hence $S = 1$. Thus
the giant component occupies the entire graph below this
point, or more strictly, a randomly chosen vertex belongs
to the giant component with probability 1 in the limit
of large graph size (but see the following discussion of
the clustering coefficient and footnote 25). In the range
$2 < \alpha < \alpha_c$ we have a non-zero giant component whose
size is given by Eq. (29). All of these results were first
shown by Aiello et al. [8].
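The critical exponent quoted above can be reproduced by solving Eq. (34) numerically. The Python sketch below is an illustration under our own assumptions: the $\zeta$-function is evaluated by a direct sum with an Euler-Maclaurin tail correction (valid for real arguments greater than 1), and the root is located by bisection on a bracket that we chose by inspection.

```python
import math

def zeta(s, terms=1000):
    """Riemann zeta for real s > 1: direct sum plus an Euler-Maclaurin tail correction."""
    n = terms
    head = sum(k ** -s for k in range(1, n))
    tail = n ** (1.0 - s) / (s - 1.0) + 0.5 * n ** -s + s * n ** (-s - 1.0) / 12.0
    return head + tail

def critical_exponent(lo=3.1, hi=4.0, tol=1e-10):
    """Bisection on f(alpha) = zeta(alpha - 2) - 2 zeta(alpha - 1), Eq. (34)."""
    f = lambda a: zeta(a - 2.0) - 2.0 * zeta(a - 1.0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:   # f is positive below the root on this bracket
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(critical_exponent())
```

The bracket must keep $\alpha - 2 > 1$, since the series used here for $\zeta$ only applies to arguments greater than 1.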

We can also calculate the clustering coefficient for the
power-law case using Eq. (31). For $\alpha < 3$ we have
$\langle k^2 \rangle \sim k_{\max}^{3-\alpha}$, where $k_{\max}$ is the maximum degree in the
network. Using the scaling of $k_{\max}$ with graph size given
earlier, $k_{\max} \sim n^{1/(\alpha-1)}$, Eq. (31) then gives

$$C \sim n^{-\beta}, \qquad \beta = \frac{3\alpha - 7}{\alpha - 1}. \tag{36}$$

This gives interesting behavior for the typical values
$2 < \alpha < 3$ of the exponent $\alpha$ seen in most networks
(see Table II). If $\alpha > \frac{7}{3}$, then $C$ tends to zero as the
graph becomes large, although it does so slower than the
$C \sim n^{-1}$ of the Poisson random graph provided $\alpha < 3$.
At $\alpha = \frac{7}{3}$, $C$ becomes constant (or logarithmic) in the
graph size, and for $\alpha < \frac{7}{3}$ it actually increases with
increasing system size.$^{24}$ Thus for scale-free networks with
smaller exponents $\alpha$, we would not be surprised to see
quite substantial values of the clustering coefficient, even
if the pattern of connections were completely random.$^{25}$
This mechanism can, for instance, account for much of
the clustering seen in the World Wide Web [319].

24 For sufficiently large networks this implies that the clustering
coefficient will be greater than 1. Physically this means that
there will be more than one edge on average between two vertices
that share a common neighbor.

25 This means in fact that the generating function formalism breaks
down for $\alpha < \frac{7}{3}$, invalidating some of the preceding results for the
power-law graph, since a fundamental assumption of the method
is that there are no short loops in the network. Aiello et al. [8]
get around this problem by assuming that the degree distribution
is cut off at $k_{\max} \sim n^{1/\alpha}$ (see Sec. III.C.2), which gives $C \to 0$
as $n \to \infty$ for all $\alpha > 2$. This however is somewhat artificial; in
real power-law networks there is normally no such cutoff.
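The scaling of the clustering coefficient with graph size for a power-law degree distribution can be checked numerically. The sketch below is an illustration under stated assumptions: it evaluates the configuration-model clustering coefficient in the form $C = [(\langle k^2 \rangle - \langle k \rangle)/\langle k \rangle]^2 / (n \langle k \rangle)$ for $p_k \propto k^{-\alpha}$ truncated at $k_{\max} = n^{1/(\alpha-1)}$, and estimates the exponent from two graph sizes that we picked arbitrarily. It is compared with the prediction $\beta = (3\alpha - 7)/(\alpha - 1)$.

```python
import math

def clustering_power_law(n, alpha):
    """Configuration-model clustering for p_k ~ k^-alpha, truncated at n^(1/(alpha-1))."""
    k_max = int(n ** (1.0 / (alpha - 1.0)))
    norm = mean_k = mean_k2 = 0.0
    for k in range(1, k_max + 1):
        w = k ** -alpha
        norm += w
        mean_k += k * w
        mean_k2 += k * k * w
    mean_k /= norm
    mean_k2 /= norm
    return ((mean_k2 - mean_k) / mean_k) ** 2 / (n * mean_k)

def fitted_beta(alpha, n1=10**6, n2=10**8):
    """Exponent beta in C ~ n^-beta, estimated from two graph sizes."""
    c1 = clustering_power_law(n1, alpha)
    c2 = clustering_power_law(n2, alpha)
    return -math.log(c2 / c1) / math.log(n2 / n1)

def predicted_beta(alpha):
    """Predicted exponent (3 alpha - 7) / (alpha - 1)."""
    return (3.0 * alpha - 7.0) / (alpha - 1.0)

if __name__ == "__main__":
    for alpha in (2.2, 2.5, 2.8):
        print(alpha, fitted_beta(alpha), predicted_beta(alpha))
```

Agreement is only approximate at finite size, because the prediction keeps just the leading power of $n$.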



3. Directed graphs



Substantially more sophisticated extensions of random 
graph models are possible than the simple first exam- 
ple given above. In this and the next few sections we 
list some of the many possibilities, starting with directed 
graphs. 

Each vertex in a directed graph has both an in-degree $j$
and an out-degree $k$, and the degree distribution therefore
becomes, in general, a double distribution $p_{jk}$ over
both degrees, as discussed in Sec. III.C. The generating
function for such a distribution is a function of two
variables

$$\mathcal{G}(x, y) = \sum_{jk} p_{jk} x^j y^k. \tag{37}$$



Each vertex A also belongs to an in-component and an
out-component, which are, respectively, the set of vertices
from which A can be reached, and the set that can be
reached from A, by following directed edges only in their
forward direction. There is also the strongly connected
component, which is the set of vertices which can both
reach and be reached from A. In a random directed graph
with a given degree distribution, the giant in, out, and
strongly connected components can all be shown [323] to
form at a single transition that takes place when

$$\sum_{jk} (2jk - j - k) p_{jk} = 0. \tag{38}$$
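As a quick illustration of Eq. (38), the Python sketch below evaluates its left-hand side for a joint distribution stored as a dictionary. The function names and the test case (independent Poisson in- and out-degrees with the same mean $z$) are our own assumptions; for that case the sum reduces to $2z^2 - 2z$, which vanishes at $z = 1$. Reading positive values as the phase with giant components is an assumption made here by analogy with Eq. (28).

```python
import math

def directed_criterion(p):
    """Left-hand side of Eq. (38); p[(j, k)] is the joint in/out-degree distribution."""
    return sum((2 * j * k - j - k) * pjk for (j, k), pjk in p.items())

def independent_poisson(z, k_max=60):
    """Joint distribution with independent Poisson in- and out-degrees (test input only)."""
    pois = [math.exp(-z) * z**k / math.factorial(k) for k in range(k_max + 1)]
    return {(j, k): pois[j] * pois[k]
            for j in range(k_max + 1) for k in range(k_max + 1)}

if __name__ == "__main__":
    for z in (0.5, 1.0, 2.0):
        print(z, directed_criterion(independent_poisson(z)))
```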



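Defining generating functions for in- and out-degree
separately and their excess-degree counterparts,

$$F_0(x) = \mathcal{G}(x, 1), \qquad F_1(x) = \frac{1}{z} \left. \frac{\partial \mathcal{G}}{\partial y} \right|_{y=1}, \tag{39a}$$

$$G_0(y) = \mathcal{G}(1, y), \qquad G_1(y) = \frac{1}{z} \left. \frac{\partial \mathcal{G}}{\partial x} \right|_{x=1}, \tag{39b}$$

the sizes of the giant out-, in-, and strongly connected
components are given by [125, 323]

$$S_{\mathrm{out}} = 1 - F_0(u), \tag{40a}$$

$$S_{\mathrm{in}} = 1 - G_0(v), \tag{40b}$$

$$S_{\mathrm{str}} = 1 - \mathcal{G}(u, 1) - \mathcal{G}(1, v) + \mathcal{G}(u, v), \tag{40c}$$

where

$$u = F_1(u), \qquad v = G_1(v). \tag{41}$$



4. Bipartite graphs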



Another class of generalizations of random graph models
is to networks with more than one type of vertex. One
of the simplest and most important examples of such a
network is the bipartite graph, which has two types of
vertices and edges running only between vertices of unlike
types. As discussed in Sec. II.A, many social networks
are bipartite, forming what the sociologists call affiliation
networks, i.e., networks of individuals joined by common
membership of groups. In such networks the individuals
and the groups are represented by the two vertex
types with edges between them representing group
membership. Networks of CEOs [167, 168], boards of
directors [104, 105, 269], and collaborations of scientists [313]
and film actors [416] are all examples of affiliation
networks. Some other networks, such as the railway network
studied by Sen et al. [366], are also bipartite, and bipartite
graphs have been used as the basis for models of
sexual contact networks [144, 315].

Bipartite graphs have two degree distributions, one
each for the two types of vertices. Since the total number
of edges attached to each type of vertex is the same,
the means $\mu$ and $\nu$ of the two distributions are related
to the numbers $M$ and $N$ of the types of vertices by
$\mu/M = \nu/N$. One can define generating functions as
before for the two types of vertices, generating both the
degree distribution and the excess degree distribution,
and denoted $f_0(x)$, $f_1(x)$, $g_0(x)$, and $g_1(x)$. Then for
example we can show that there is a phase transition at
which a giant component appears when $f_1'(1) g_1'(1) = 1$.
Expressions for the expected size of giant and non-giant
components can easily be derived [323].

In many cases, graphs that are fundamentally bipartite
are actually studied by projecting them down onto
one set of vertices or the other, giving so-called "one-mode"
projections. For example, in the study of boards of
directors of companies, it has become standard to look at
board "interlocks." Two boards are said to be interlocked
if they share one or more common members, and
the graph of board interlocks is the one-mode projection
of the full board graph onto the vertices representing just
the boards. Many results for these one-mode projections
can also be extracted from the generating function
formalism. To give one example, the projected networks
do not have a vanishing clustering coefficient $C$ in the
limit of large system size, but instead can be shown to
obey [?]

$$\frac{1}{C} - 1 = \frac{(\mu_2 - \mu_1)(\nu_2 - \nu_1)^2}{\mu_1 \nu_1 (2\nu_1 - 3\nu_2 + \nu_3)}, \tag{42}$$

where $\mu_n$ and $\nu_n$ are the $n$th moments of the degree
distributions of the two vertex types.
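The one-mode projection itself is simple to compute from membership data. The following Python sketch is an illustration only; the data layout (a dictionary from each group to the set of its members) and the toy board data are our own assumptions.

```python
from itertools import combinations

def one_mode_projection(memberships):
    """memberships maps each group (e.g. a board) to the set of its members.
    Returns the undirected edges between groups sharing at least one member."""
    edges = set()
    for a, b in combinations(sorted(memberships), 2):
        if memberships[a] & memberships[b]:   # non-empty intersection: an interlock
            edges.add((a, b))
    return edges

if __name__ == "__main__":
    boards = {
        "A": {"director1", "director2"},
        "B": {"director2", "director3"},
        "C": {"director4"},
    }
    print(one_mode_projection(boards))
```

Any group with three or more members in the projection onto individuals, or any individual sitting on three or more groups in the projection onto groups, produces a triangle, which is one way to see why the clustering coefficient of the projection does not vanish.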

More complicated types of network structure can be
introduced by increasing the number of different types
of vertices beyond two, and by relaxing the patterns of
connection between vertex types. For example, one can
define a model with the type of mixing matrix shown
in Table III and solve exactly for many of the standard
properties.



5. Degree correlations 

The type of degree correlations discussed in Sec. III.F
can also be introduced into a random graph model [314].
Extending the formalism of Sec. III.E, we can define the
probability distribution $e_{jk}$ to be the probability that a
randomly chosen edge on a graph connects vertices of
excess degrees $j$ and $k$. On an undirected graph, this
quantity is symmetric and satisfies

$$\sum_{jk} e_{jk} = 1, \qquad \sum_j e_{jk} = q_k. \tag{43}$$

Then the equivalent of Eq. (29) is

$$S = 1 - p_0 - \sum_{k=1}^{\infty} p_k u_{k-1}^k, \qquad u_j = \frac{\sum_k e_{jk} u_k^k}{\sum_k e_{jk}}, \tag{44}$$

which must be solved self-consistently for the entire set
$\{u_k\}$ of quantities, one for each possible value of the
excess degree. The phase transition at which a giant
component appears takes place when $\det(\mathbf{I} - \mathbf{m}) = 0$,
where $\mathbf{m}$ is the matrix with elements $m_{jk} = k e_{jk}/q_j$.
Matrix conditions of this form appear to be the typical
generalization of the criterion for the appearance of a
giant component to graphs with non-trivial mixing
patterns [?].

Two other random graph models for degree correla- 
tions are also worth mentioning. One is the exponential 
random graph, which we study in more detail in the fol- 
lowing section. This is a general model, which has been 
applied to the particular problem of degree correlations 
by Berg and Lässig [48].

A more specialized model that aims to explain the degree
anticorrelations seen in the Internet has been put
forward by Maslov et al. [275]. They suggest that these
anticorrelations are a simple result of the fact that the
Internet graph has at most one edge between any vertex
pair. Thus they are led to consider the ensemble of
all networks with a given degree sequence and no double
edges. (The configuration model, by contrast, allows
double edges, and typical graphs usually have at least a
few such edges, which would disqualify them from membership
in the ensemble of Maslov et al.) The ensemble
with no duplicate edges, it turns out, is hard to treat
analytically [?, 407], so Maslov et al. instead investigate
it numerically, sampling the ensemble at random using a
Monte Carlo algorithm. Their results appear to indicate
that anticorrelations of the type seen in the Internet do
indeed arise as a finite-size effect within this model. (An
alternative explanation of the same observations has been
put forward by Capocci et al. [?], who use a modified
version of the model of Barabási and Albert discussed in
Sec. VII.B to show that correlations can arise through
network growth processes.)

V. EXPONENTIAL RANDOM GRAPHS AND MARKOV GRAPHS

The generalized random graph models of the previous
sections effectively address one of the principal shortcomings
of early network models such as the Poisson random
graph, their unrealistic degree distribution. However,
they have a serious shortcoming in that they fail to capture
the common phenomenon of transitivity described
in Sec. III.B. The only solvable random graph models
that currently incorporate transitivity are the bipartite
and community-structured models of Sec. IV.B.4 and certain
dual-graph models [345], and these cover rather special
cases. For general networks we currently have no
idea how to incorporate transitivity into random graph
models; the crucial property of independence between the
neighbors of a vertex is destroyed by the presence of short
loops in a network, invalidating all the techniques used
to derive solutions. Some approximate methods may be
useful in limited ways [317] or perhaps some sort of
perturbative analysis will prove possible, but no progress has
yet been made in this direction.

The main hope for progress in understanding the
effects of transitivity, which are certainly substantial,
seems to lie in formulating a completely different model
or models, based around some alternative ensemble of
graph structures. In this and the following section we
describe two candidate models, the Markov graphs of
Holland and Leinhardt [194] and Strauss [?, 385] and
the small-world model of Watts and Strogatz [416].

Strauss [385] considers exponential random graphs, also
(in a slightly generalized form) called $p^*$ models [?, 410],
which are a class of graph ensembles of fixed vertex number
$n$ defined by analogy with the Boltzmann ensemble of
statistical mechanics.$^{26}$ Let $\{\epsilon_i\}$ be a set of measurable
properties of a single graph, such as the number of edges,
the number of vertices of given degree, or the number of
triangles of edges in the graph. These quantities play a
role similar to energy in statistical mechanics. And let
$\{\beta_i\}$ be a set of inverse-temperature or field parameters,
whose values we are free to choose. We then define the
exponential random graph model to be the set of all possible
graphs (undirected in the simplest case) of $n$ vertices
in which each graph $G$ appears with probability

$$P(G) = \frac{1}{Z} \exp\Bigl(-\sum_i \beta_i \epsilon_i\Bigr), \tag{45}$$

where the partition function $Z$ is

$$Z = \sum_G \exp\Bigl(-\sum_i \beta_i \epsilon_i\Bigr). \tag{46}$$



26 Indeed, in a development typical of this highly interdisciplinary
field, exponential random graphs have recently been rediscovered,
apparently quite independently, by physicists [48, ?].

For a sufficiently large set of temperature parameters
$\{\beta_i\}$, this definition can encompass any probability
distribution over graphs that we desire, although its practical
application requires that the size of the set be limited to
a reasonably small number.

The ensemble average of a graph observable $\epsilon_i$ is then
found by taking a suitable derivative of the (reduced) free
energy $f = -\log Z$:

$$\langle \epsilon_i \rangle = \sum_G \epsilon_i(G) P(G) = \frac{1}{Z} \sum_G \epsilon_i \exp\Bigl(-\sum_j \beta_j \epsilon_j\Bigr) = \frac{\partial f}{\partial \beta_i}. \tag{47}$$

Thus, the free energy is a generating function for the
expectation values of the observables, in a manner familiar
from statistical field theory. If a particular observable
of interest does not appear in the exponent of Eq. (45) (the
"graph Hamiltonian"), then one can simply introduce it,
with a corresponding temperature $\beta_i$ which is set to zero.

While these preliminary developments appear elegant
in principle, little real progress has been made. One
would like to find the appropriate Gaussian field theory
for which $f$ can be expressed in closed form, and
then perturb around it to derive a diagrammatic expansion
for the effects of higher-order graph operators. In
fact, one can show that the Feynman diagrams for the
expansion are the networks themselves. Unfortunately,
carrying through the entire field-theoretic program has
not proved easy. The general approach one should take
is clear [?], but the mechanics appear intractable
for most cases of interest. Some progress can be made by
restricting ourselves to Markov graphs, which are the subset
of graphs in which the presence or absence of an edge
between two vertices in the graph is correlated only with
those edges that share one of the same two vertices;
edge pairs that are disjoint (have no vertices in common)
are uncorrelated. Overall however, the question of how
to carry out calculations in exponential random graph
ensembles is an open one.

In the absence of analytic progress on the model, therefore,
researchers have turned to Monte Carlo simulation,
a technique to which the exponential random graph lends
itself admirably. Once the values of the parameters $\{\beta_i\}$
are specified, the form (45) of $P(G)$ makes generation
of graphs correctly sampled from the ensemble straightforward
using a Metropolis-Hastings type Markov chain
method. One defines an ergodic move-set in the space
of graphs with given $n$, and then repeatedly generates
moves from this set, accepting them with probability

$$p = \begin{cases} 1 & \text{if } P(G') > P(G), \\ P(G')/P(G) & \text{otherwise,} \end{cases} \tag{48}$$



and rejecting them with probability $1 - p$, where $G'$ is
the graph after performance of the move. Because of
the particular form, Eq. (45), assumed for $P(G)$, this
acceptance probability is particularly simple to calculate:

$$\frac{P(G')}{P(G)} = \exp\Bigl(-\sum_i \beta_i [\epsilon_i' - \epsilon_i]\Bigr). \tag{49}$$

This expression is independent of the value of the partition
function and its evaluation involves calculating only
the differences $\epsilon_i' - \epsilon_i$ of the energy-like graph properties
$\epsilon_i$, which for local move-sets and local properties
can often be accomplished in time independent of graph
size. Suitable move-sets are: (a) addition and removal of
edges between randomly chosen vertex pairs for the case
of variable edge numbers; (b) movement of edges randomly
from one place to another for the case of fixed edge
numbers but variable degree sequence; (c) edge swaps
of the form $\{(v_1, w_1), (v_2, w_2)\} \to \{(v_1, v_2), (w_1, w_2)\}$ for
the case of fixed degree sequence, where $(v_1, w_1)$ denotes
an edge from vertex $v_1$ to vertex $w_1$. Monte Carlo
algorithms of this type are straightforward to implement
and appear to converge quickly, allowing us to study quite
large graphs.
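The Monte Carlo procedure can be illustrated with the simplest possible graph Hamiltonian. The Python sketch below is ours and rests on several assumptions: a single observable, the number of edges $m$, with parameter $\beta$; move-set (a) implemented as toggling a uniformly chosen vertex pair; and arbitrary choices for the number of sweeps and the burn-in. For this Hamiltonian every vertex pair is independent, so the sampled edge density can be compared with the exact value $1/(1 + e^{\beta})$.

```python
import math
import random

def sample_edge_density(n, beta, sweeps=200, burn_in=50, seed=0):
    """Metropolis sampling of P(G) ~ exp(-beta * m), with m the number of edges.
    Returns the edge density averaged over the sweeps after burn-in."""
    rng = random.Random(seed)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    edges = set()
    density_sum, samples = 0.0, 0
    for sweep in range(sweeps):
        for _ in range(len(pairs)):
            pair = rng.choice(pairs)                   # propose toggling one pair
            delta_m = -1 if pair in edges else 1       # change in the edge count
            ratio = math.exp(-beta * delta_m)          # P(G')/P(G), as in Eq. (49)
            if ratio >= 1.0 or rng.random() < ratio:   # acceptance rule, Eq. (48)
                if delta_m == 1:
                    edges.add(pair)
                else:
                    edges.remove(pair)
        if sweep >= burn_in:
            density_sum += len(edges) / len(pairs)
            samples += 1
    return density_sum / samples

if __name__ == "__main__":
    beta = 1.0
    print(sample_edge_density(30, beta), 1.0 / (1.0 + math.exp(beta)))
```

Only the change in the edge count is needed at each step, which is the practical content of the remark that the acceptance ratio is independent of the partition function. Adding further observables, such as a triangle count, would change only the computation of the exponent.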

There is, however, one unfortunate pathology of the
exponential random graph that plagues numerical work, 
and particularly affects Markov graphs as they are used 
to model transitivity. If, for example, we include a term 
in the graph Hamiltonian that is linear in the number 
of triangles in the graph, with an accompanying positive 
temperature favoring these triangles, then the model has 
a tendency to "condense," forming regions of the graph 
that are essentially complete cliques — subsets of vertices 
within which every possible edge exists. It is easy to 
see why the model shows this behavior: cliques have the 
largest number of triangles for the number of edges they 
contain, and are therefore highly energetically favored, 
while costing the system a minimum in entropy by virtue 
of leaving the largest possible number of other edges free 
to contribute to the (presumably extensive) entropy of 
the rest of the graph. Networks in the real world, however,
do not seem to have this sort of "clumpy" transitivity:
regions of cliquishness contributing heavily to the clustering
coefficient, separated by other regions with few
triangles. It is not clear how this problem is to be circumvented,
although for higher temperatures (lower values
of the parameters $\{\beta_i\}$) it is less problematic, since
higher temperatures favor entropy over energy.

Another area in which some progress has been made is 
in techniques for extracting appropriate values for the 
temperature parameters in the model from real-world 
network data. Procedures for doing this have been partic- 
ularly important for social network applications. Param- 
eters so extracted can be fed back into the Monte Carlo 
graph generation methods described above to generate 
model graphs which have similar statistical properties to 
their real-world counterparts and which can be used for 
hypothesis testing or as a substrate for further network 
simulations. Reviews of parameter extraction techniques
can be found in Refs. |H and HH 



VI. THE SMALL-WORLD MODEL 

A less sophisticated but more tractable model of a 
network with high transitivity is the small-world model 
proposed by Watts and Strogatz [412, 416]. 27 As
touched upon in Sec. III.E, networks may have a geographical
component to them; the vertices of the network
have positions in space and in many cases it is reasonable 
to assume that geographical proximity will play a role in 
deciding which vertices are connected to which others. 
The small-world model starts from this idea by positing 
a network built on a low-dimensional regular lattice and 
then adding or moving edges to create a low density of 
"shortcuts" that join remote parts of the lattice to one 
another. 

Small-world models can be built on lattices of any dimension
or topology, but the best studied case by far is
the one-dimensional one. If we take a one-dimensional lattice
of L vertices with periodic boundary conditions, i.e., a
ring, and join each vertex to its neighbors k or fewer lattice
spacings away, we get a system like Fig. 11a, with Lk
edges. The small-world model is then created by taking
a small fraction of the edges in this graph and "rewiring"
them. The rewiring procedure involves going through
each edge in turn and, with probability p, moving one
end of that edge to a new location chosen uniformly at
random from the lattice, except that no double edges or
self-edges are ever created. This process is illustrated in
Fig. 11b.
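
The rewiring procedure can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, the lattice is assumed to satisfy L > 2k so that it contains no duplicate edges, and a proposed move that would create a self-edge or a double edge is simply skipped, which is one of several possible ways to honor the constraint stated above.

```python
import random

def ring_lattice(L, k):
    """Edges joining each vertex to its neighbors k or fewer spacings away.

    Assumes L > 2k, so that the list holds exactly L*k distinct edges.
    """
    return [(i, (i + d) % L) for i in range(L) for d in range(1, k + 1)]

def watts_strogatz(L, k, p, seed=0):
    """Go through each lattice edge in turn and rewire one end with probability p."""
    rng = random.Random(seed)
    edges = ring_lattice(L, k)
    present = {frozenset(e) for e in edges}   # undirected edges currently in the graph
    for idx, (u, v) in enumerate(edges):
        if rng.random() >= p:
            continue
        w = rng.randrange(L)                  # new location, uniform over the lattice
        # Never create self-edges or double edges; in that case the edge
        # is left where it is (an assumption of this sketch).
        if w == u or frozenset((u, w)) in present:
            continue
        present.remove(frozenset((u, v)))
        present.add(frozenset((u, w)))
        edges[idx] = (u, w)
    return edges
```

With p = 0 the function returns the regular lattice unchanged, and for any p the number of edges stays fixed at Lk.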

The rewiring process allows the small-world model
to interpolate between a regular lattice and something
which is similar, though not identical (see below), to a
random graph. When p = 0, we have a regular lattice.
It is not hard to show that the clustering coefficient of
this regular lattice is C = (3k − 3)/(4k − 2), which tends
to 3/4 for large k. The regular lattice, however, does not
show the small-world effect. Mean geodesic distances between
vertices tend to L/4k for large L. When p = 1,
every edge is rewired to a new random location and the
graph is almost a random graph, with typical geodesic
distances on the order of log L / log k, but very low clustering
$C \simeq 2k/L$ (see Sec. IV.A). As Watts and Strogatz
showed by numerical simulation, however, there exists
a sizable region in between these two extremes for
which the model has both low path lengths and high
transitivity (see Fig. 12).

The original model proposed by Watts and Strogatz is 
somewhat baroque. The fact that only one end of each 
chosen edge is rewired, not both, that no vertex is ever 
connected to itself, and that an edge is never added be- 
tween vertex pairs where there is already one, makes it 
quite difficult to enumerate or average over the ensemble 



27 An equivalent model was proposed by Ball et al. [28] some years
earlier, as a model of the spread of disease between households,
but appears not to have been widely adopted.







FIG. 11 (a) A one-dimensional lattice with connections between all vertex pairs separated by k or fewer lattice spacings, with
k = 3 in this case. (b) The small-world model [412, 416] is created by choosing at random a fraction p of the edges in the graph
and moving one end of each to a new location, also chosen uniformly at random. (c) A slight variation on the model [289, 324]
in which shortcuts are added randomly between vertices, but no edges are removed from the underlying one-dimensional lattice.



of graphs. For the purposes of mathematical treatment,
the model can be simplified considerably by rewiring both
ends of each chosen edge, and by allowing both double
and self edges. This results in a system that genuinely
interpolates between a regular lattice and a random graph.
Another variant of the model that has become popular
was proposed independently by Monasson [289] and by
Newman and Watts [324]. In this variant, no edges are
rewired. Instead "shortcuts" joining randomly chosen
vertex pairs are added to the low-dimensional lattice
(see Fig. 11c). The parameter p governing the density of
these shortcuts is defined so as to make it as similar as
possible to the parameter p in the first version of the
model: p is defined as the probability per edge on the
underlying lattice of there being a shortcut anywhere in
the graph. Thus the mean total number of shortcuts is
Lkp and the mean degree is 2k(1 + p). This version



FIG. 12 The clustering coefficient C and mean vertex-vertex
distance $\ell$ in the small-world model of Watts and Strogatz
as a function of the rewiring probability p. For
convenience, both C and $\ell$ are divided by their maximum values,
which they assume when p = 0. Between the extremes
p = 0 and p = 1, there is a region in which clustering is high
and mean vertex-vertex distance is simultaneously low.



of the model has the desirable property that no vertices 
ever become disconnected from the rest of the network, 
and hence the mean vertex-vertex distance is always for- 
mally finite. Both this version and the original have been 
studied at some length in the mathematical and physical
literature 3091. 



A. Clustering coefficient 

The clustering coefficient for both versions of the small-world
model can be calculated relatively easily. For the
original version, Barrat and Weigt [40] showed that

$$C = \frac{3(k-1)}{2(2k-1)}\,(1-p)^3, \tag{50}$$



while for the version without rewiring, Newman |31 
showed that 



$$C = \frac{3(k-1)}{2(2k-1) + 4kp(p+2)}. \tag{51}$$
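
As a quick consistency check, both expressions should reduce to the lattice value C = (3k − 3)/(4k − 2) at p = 0. The sketch below compares them with a direct count of triangles and connected triples on the regular lattice. The function names are illustrative, and the form used for the rewired model assumes the factor $(1-p)^3$ of the Barrat and Weigt result.

```python
from itertools import combinations

def clustering_coefficient(L, edges):
    """C = 3 * (number of triangles) / (number of connected triples)."""
    adj = [set() for _ in range(L)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    closed = triples = 0
    for i in range(L):
        for a, b in combinations(adj[i], 2):
            triples += 1            # a - i - b is a connected triple centered on i
            closed += b in adj[a]   # it is closed if a and b are also joined
    # Each triangle is counted three times in `closed`, once per center.
    return closed / triples

def c_rewired(k, p):
    """Clustering of the original (rewired) model, assuming the (1 - p)^3 form."""
    return 3 * (k - 1) / (2 * (2 * k - 1)) * (1 - p) ** 3

def c_shortcuts(k, p):
    """Clustering of the variant in which shortcuts are added and nothing is rewired."""
    return 3 * (k - 1) / (2 * (2 * k - 1) + 4 * k * p * (p + 2))

L, k = 100, 3
lattice = [(i, (i + d) % L) for i in range(L) for d in range(1, k + 1)]
print(clustering_coefficient(L, lattice), c_rewired(k, 0.0), c_shortcuts(k, 0.0))
```

The three printed numbers should agree, and both formulas decrease monotonically as p grows.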



B. Degree distribution 

The degree distribution of the small-world model does
not match most real-world networks very well, although
this is not surprising, since this was not a goal of the
model in the first place. For the version without rewiring,
each vertex has degree at least 2k, for the edges of the
underlying regular lattice, plus a binomially distributed
number of shortcuts. Hence the probability $p_j$ of having
degree j is

$$p_j = \binom{L}{j-2k} \left[\frac{2kp}{L}\right]^{j-2k} \left[1 - \frac{2kp}{L}\right]^{L-j+2k} \tag{52}$$



for j ≥ 2k, and $p_j = 0$ for j < 2k. For the rewired
version of the model, the distribution has a lower cutoff
at k rather than 2k, and is rather more complicated. The
full expression is [40]

$$p_j = \sum_{n=0}^{\min(j-k,\,k)} \binom{k}{n} (1-p)^n p^{k-n}\, \frac{(pk)^{j-k-n}}{(j-k-n)!}\, e^{-pk} \tag{53}$$

for j ≥ k, and $p_j = 0$ for j < k.
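
Both distributions are easy to evaluate numerically. The sketch below is a sanity check rather than part of the original analysis: it assumes the binomial form for the version without rewiring and the sum over n with upper limit min(j − k, k) for the rewired version, and the function names are illustrative. If the forms are right, each distribution should sum to one, with mean degree 2k(1 + p) when shortcuts are added and 2k when edges are only rewired.

```python
import math

def shortcut_degree_dist(j, L, k, p):
    """Degree distribution without rewiring: 2k lattice edges plus binomial shortcuts."""
    s = j - 2 * k                      # number of shortcut ends on the vertex
    if s < 0 or s > L:
        return 0.0
    q = 2 * k * p / L
    return math.comb(L, s) * q**s * (1 - q) ** (L - s)

def rewired_degree_dist(j, k, p):
    """Degree distribution of the rewired model, with lower cutoff at k."""
    if j < k:
        return 0.0
    total = 0.0
    for n in range(min(j - k, k) + 1):
        r = j - k - n                  # edge ends that arrived through rewiring
        total += (math.comb(k, n) * (1 - p) ** n * p ** (k - n)
                  * (p * k) ** r / math.factorial(r) * math.exp(-p * k))
    return total

L, k, p = 200, 3, 0.1
degrees = range(0, 2 * k + L + 1)
print(sum(shortcut_degree_dist(j, L, k, p) for j in degrees))
print(sum(j * rewired_degree_dist(j, k, p) for j in range(0, 80)))
```

The upper limit of 80 in the last sum is an arbitrary truncation; the Poisson factor makes the neglected tail negligible for these parameter values.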



C. Average path length 

By far the most attention has been focused on the average
geodesic path length of the small-world model. We
denote this quantity $\ell$. We do not have any exact solution
for the value of $\ell$ yet, but a number of partial exact results
are known, including scaling forms, as well as some
approximate solutions for its behavior as a function of
the model's parameters.

In the limit $p \to 0$, the model is a "large world":
the typical path length tends to $\ell = L/4k$, as discussed
above. Small-world behavior, by contrast, is typically
characterized by logarithmic scaling $\ell \sim \log L$ (see
Sec. III.A), which we see for large p, where the model
becomes like a random graph. In between these two limits
there is presumably some sort of crossover from large-
to small-world behavior. Barthelemy and Amaral [42]
conjectured that $\ell$ satisfies a scaling relation of the form

$$\ell = \xi\, g(L/\xi), \tag{54}$$

where $\xi$ is a correlation length that depends on p, and
g(x) an unknown but universal scaling function that depends
only on system dimension and lattice geometry,
but not on L, $\xi$ or p. The variation of $\xi$ defines the
crossover from large- to small-world behavior; the known
behavior of $\ell$ for small and large L can be reproduced
by having $\xi$ diverge as $p \to 0$ and

$$g(x) \sim \begin{cases} x & \text{for } x \ll 1 \\ \log x & \text{for } x \gg 1. \end{cases} \tag{55}$$

Barthelemy and Amaral conjectured that $\xi$ diverges as
$\xi \sim p^{-\tau}$ for small p, where $\tau$ is a constant exponent.
These conjectures have all turned out to be correct.
Barthelemy and Amaral also conjectured on the basis
of numerical results that $\tau = 2/3$, which turned out not to
be correct [324].

Equation (54) has been shown to be correct by a renormalization
group treatment of the model [324]. From this
treatment one can derive a scaling form for $\ell$ of

$$\ell = \frac{L}{k} f(Lkp), \tag{56}$$

which is equivalent to (54), except for a factor of k, if $\xi = 1/kp$
and g(x) = x f(x). Thus we immediately conclude
that the exponent $\tau$ defined by Barthelemy and Amaral
is 1, as was also argued by Barrat [39] using a mixture of
scaling ideas and numerical simulation.

The scaling form (56) shows that we can go from the
large-world regime to the small-world one either by increasing
p or by increasing the system size L. Indeed, the
crucial scaling variable Lkp that appears as the argument
of the scaling function is simply equal to the mean number
of shortcuts in the model, and hence $\ell$ as a fraction
of system size depends only on how many shortcuts there
are, for given k.

Making any further progress has proved difficult. We
would like to be able to calculate the scaling function
f(x), but this turns out not to be easy. The calculation
is possible, though complicated, for a variant
model in which there are no shortcuts but random sites
are connected to a single central "hub" vertex [115]. But
for the normal small-world model no exact solution is
known, although some additional exact scaling forms
have been found [253]. Accurate numerical measurements
have been carried out for system sizes up to
about $L = 10^7$ and quite
good results can be derived using series expansions [325].
A mean-field treatment of the model has been given by
Newman et al. [322], which shows that f(x) is approximately

$$f(x) = \frac{1}{2\sqrt{x^2 + 2x}} \tanh^{-1} \sqrt{\frac{x}{x+2}}, \tag{57}$$

and Barbour and Reinert [38] have further shown that
this result is the leading order term in an expansion for $\ell$
that can be used to derive more accurate results for f(x).
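
The mean-field form of f(x) can be checked numerically against the two limits discussed above. The sketch below assumes the form $f(x) = \tanh^{-1}\sqrt{x/(x+2)} \,/\, 2\sqrt{x^2+2x}$ together with the scaling relation $\ell = (L/k) f(Lkp)$; the function names are illustrative. For very few shortcuts f should approach 1/4, recovering $\ell = L/4k$, and for many shortcuts it should behave as $\log(2x)/4x$, giving logarithmic growth of $\ell$ with L.

```python
import math

def f_mean_field(x):
    """Mean-field scaling function; x = L*k*p is the mean number of shortcuts (x > 0)."""
    root = math.sqrt(x * x + 2 * x)
    # x / root equals sqrt(x / (x + 2)), written this way to reuse the square root.
    return math.atanh(x / root) / (2 * root)

def mean_path_length(L, k, p):
    """Mean geodesic distance from the scaling form l = (L/k) f(Lkp)."""
    return (L / k) * f_mean_field(L * k * p)

print(f_mean_field(1e-8))                              # large-world limit
print(f_mean_field(1e6) * 4e6 / math.log(2e6))         # ratio to log(2x)/(4x)
print(mean_path_length(1000, 1, 1e-9))                 # close to L/4k
```

For extremely large x the argument of the inverse hyperbolic tangent rounds to 1 in floating point, so this direct evaluation is only suitable for moderate x.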

The primary use of the small-world model has been
as a substrate for the investigation of various processes
taking place on graphs, such as percolation [294, 326],
coloring, coupled oscillators [37, 416], iterated games,
diffusion processes, epidemic processes [28, 235, 255, 293],
and spin models. Some of this work is
discussed further in Section VIII.

A few variations of the small-world model have been
proposed. Several authors have studied the model in dimension
higher than one; the
results are qualitatively similar to the one-dimensional
case and follow the expected scaling laws. Various authors
have also studied models in which shortcuts preferentially
join vertices that are close together on the underlying
lattice. Of particular note
is the work by Kleinberg, which is discussed in
Sec. VIII.C. Rozenfeld et al. and independently
Warren et al. [408] have studied models in which there
are only shortcuts and no underlying lattice, but the signature
of the lattice still remains, guiding shortcuts to
fall with higher probability between more closely spaced
vertices (see Sec. VIII.A).



VII. MODELS OF NETWORK GROWTH

All of the models discussed so far take observed properties
of real-world networks, such as degree sequences
or transitivity, and attempt to create networks that incorporate
those properties. The models do not, however,
help us to understand how networks come to have those
properties in the first place. In this section we examine
a class of models whose primary goal is to explain
network properties. In these models, the networks typically
grow by the gradual addition of vertices and edges
in some manner intended to reflect growth processes that
might be taking place on the real networks, and it is these
growth processes that lead to the characteristic structural
features of the network. 28 For example, a number of authors
have
studied models of network transitivity that make use of
"triadic closure" processes. In these models, edges are
added to the network preferentially between pairs of vertices
that have another third vertex as a common neighbor.
In other words, edges are added so as to complete
triangles, thereby increasing the number of triangles in the
numerator of the clustering coefficient
and so increasing the amount of transitivity in the network.
(There is some empirical evidence from collaboration
networks in support of this mechanism [310].)

But the best studied class of network growth models
by far, and the class on which we concentrate primarily in
this section, is the class of models aimed at explaining the
origin of the highly skewed degree distributions discussed
in Sec. III.C. Indeed these models are some of the best
studied in the whole of the networks literature, having
been the subject of an extraordinary number of papers
in the last few years. In this section we describe first
the archetypal model of Price [344], which was based in
turn on previous work by Simon [370]. Then we describe
the highly influential model of Barabasi and Albert [32],
which has been the driving force behind much of the recent
work in this area. We also describe a number of
variations and generalizations of these models due to a
variety of authors.



A. Price's model 

As discussed in Sec. III.C, the physicist-turned-historian-of-science
Derek de Solla Price described in
1965 probably the first example of what would now be
called a scale-free network; he studied the network of citations
between scientific papers and found that both in-
and out-degrees (number of times a paper has been cited
and number of other papers a paper cites) have power-law


28 An alternative and intriguing idea, which has so far not been
investigated in much depth, is that features such as power-law
degree distributions may arise through network optimization. See,
for instance, Refs. 29, 156, 166, 395, 417, and 418.



distributions [343]. Apparently intrigued by the appearance
of these power laws, Price published another paper
some years later [344] in which he offered what is now the
accepted explanation for power-law degree distributions.
Like many after him, his work built on ideas developed in
the 1950s by Herbert Simon [370], who showed that
power laws arise when "the rich get richer," when the
amount you get goes up with the amount you already
have. In sociology this is referred to as the Matthew effect
[282], after the biblical edict, "For to every one that
hath shall be given. . ." (Matthew 25:29). 29 Price called
it cumulative advantage. Today it is usually known under
the name preferential attachment, coined by Barabasi
and Albert [32].

The important contribution of Price's work was to take 
the ideas of Simon and apply them to the growth of a net- 
work. Simon was thinking of wealth distributions in his 
early work, and although he later gave other applications 
of his ideas, none of them were to networked systems. 
Price appears to have been the first to discuss cumulative 
advantage specifically in the context of networks, and in 
particular in the context of the network of citations be- 
tween papers and its in-degree distribution. His idea was 
that the rate at which a paper gets new citations should 
be proportional to the number that it already has. This 
is easy to justify in a qualitative way. The probability 
that one comes across a particular paper whilst reading 
the literature will presumably increase with the number 
of other papers that cite it, and hence the probability 
that you cite it yourself in a paper that you write will 
increase similarly. The same argument can be applied 
to other networks also, such as the Web. It is not clear 
that the dependence of citation probability on previous 
citations need be strictly linear, but certainly this is the 
simplest assumption one could make and it is the one 
that Price, following Simon, adopts. We now describe in 
detail Price's model and his exact solution of it, which 
uses what we would now call a master-equation or
rate-equation method.

Consider a directed graph of n vertices, such as a citation
network. Let $p_k$ be the fraction of vertices in the
network with in-degree k, so that $\sum_k p_k = 1$. New vertices
are continually added to the network, though not
necessarily at a constant rate. Each added vertex has a
certain out-degree (the number of papers that it cites)
and this out-degree is fixed permanently at the creation
of the vertex. The out-degree may vary from one vertex
to another, but the mean out-degree, which is denoted m,



is a constant over time. 30 (Certain conditions on the distribution
of m about the mean must hold; see for instance
Ref. 134.) The value m is also the mean in-degree of the
network: $\sum_k k p_k = m$. Since the out-degree can vary between
vertices, m can take non-integer values, including
values less than 1.



29 In fact, this is really only a half of the Matthew effect, since the
same verse continues, ". . . but from him that hath not, that also
which he seemeth to have shall be taken away." In the processes
studied by Simon and Price nothing is taken away from anyone.
The full Matthew effect, with both the giving and the taking
away, corresponds more closely to the Polya urn process than to
Price's cumulative advantage. Price points out this distinction
in his paper [344].

In the simplest form of cumulative advantage process
the probability of attachment of one of our new edges to
an old vertex (i.e., the probability that a newly appearing
paper cites a previous paper) is simply proportional
to the in-degree k of the old vertex. This, however, immediately
gives us a problem, since each vertex starts with
in-degree zero, and hence would forever have zero probability
of gaining new edges. To circumvent this problem,
Price suggests that the probability of attachment to a
vertex should be proportional to $k + k_0$, where $k_0$ is a
constant. Although he discusses the case of general $k_0$,
all his mathematical developments are for $k_0 = 1$, which
he justifies for the citation network by saying that one
can consider the initial publication of a paper to be its
first citation (of itself by itself). Thus the probability of
a new citation is proportional to k + 1.

The probability that a new edge attaches to any of the
vertices with degree k is thus

$$\frac{(k+1)p_k}{\sum_k (k+1)p_k} = \frac{(k+1)p_k}{m+1}. \tag{58}$$

The mean number of new citations per vertex added is
simply m, and hence the mean number of new citations to
vertices with current in-degree k is $(k+1)p_k m/(m+1)$.
The number $np_k$ of vertices with in-degree k decreases
by this amount, since the vertices that get new citations
become vertices of degree k + 1. However, the number
of vertices of in-degree k increases because of influx from
the vertices previously of degree k − 1 that have also just
acquired a new citation, except for vertices of degree zero,
which have an influx of exactly 1. If we denote by $p_{k,n}$
the value of $p_k$ when the graph has n vertices, then the
net change in $np_k$ per vertex added is

$$(n+1)p_{k,n+1} - np_{k,n} = \bigl[k p_{k-1,n} - (k+1)p_{k,n}\bigr] \frac{m}{m+1}, \tag{59}$$

for k ≥ 1, or

$$(n+1)p_{0,n+1} - np_{0,n} = 1 - p_{0,n}\frac{m}{m+1}, \tag{60}$$

for k = 0. Looking for stationary solutions $p_{k,n+1} = p_{k,n} = p_k$,
we then find

$$p_k = \begin{cases} \bigl[k p_{k-1} - (k+1)p_k\bigr]\, m/(m+1) & \text{for } k \ge 1, \\ 1 - p_0\, m/(m+1) & \text{for } k = 0. \end{cases} \tag{61}$$



30 Elsewhere in this review we have used the letter z to denote mean
degree. While it would make sense in many ways to use the same
notation here, we have opted instead to change notation and
use m because this is the notation used in most of the recent
papers on growing networks. The reader should bear in mind
therefore that m is not, as previously, the total number of edges
in the graph.


Rearranging, we find $p_0 = (m+1)/(2m+1)$ and
$p_k = p_{k-1}\, k/(k+2+1/m)$, or

$$p_k = \frac{k(k-1)\cdots 1}{(k+2+1/m)\cdots(3+1/m)}\, p_0 = (1+1/m)\,\mathrm{B}(k+1,\, 2+1/m), \tag{62}$$

where $\mathrm{B}(a,b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$ is Legendre's beta-function,
which goes asymptotically as $a^{-b}$ for large a
and fixed b, and hence

$$p_k \sim k^{-(2+1/m)}. \tag{63}$$



In other words, in the limit of large n, the degree distribution
has a power-law tail with exponent $\alpha = 2 + 1/m$.
This will typically give exponents in the interval between
2 and 3, which is in agreement with the values seen in
real-world networks (see Table II). (Bear in mind that
the mean degree m need not take an integer value, and
can be less than 1.) Price gives a comparison between his
model and citation network data from the Science Citation
Index, making a plausible case that the parameter m
has about the right value to give the observed power-law
citation distribution.
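
The closed-form solution can be checked against the recurrence it was derived from. The sketch below, with illustrative function names, evaluates $p_k = (1 + 1/m)\,\mathrm{B}(k+1, 2+1/m)$ through logarithms of gamma functions, compares it with the recurrence $p_k = p_{k-1}\,k/(k+2+1/m)$ started from $p_0 = (m+1)/(2m+1)$, and estimates the tail exponent from the ratio $p_k/p_{2k}$ at large k.

```python
import math

def price_degree_dist(k, m):
    """Closed form p_k = (1 + 1/m) B(k + 1, 2 + 1/m), via log-gamma for stability."""
    a, b = k + 1, 2 + 1 / m
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (1 + 1 / m) * math.exp(log_beta)

def price_recurrence(kmax, m):
    """p_0 ... p_kmax from p_k = p_{k-1} k / (k + 2 + 1/m)."""
    p = [(m + 1) / (2 * m + 1)]
    for k in range(1, kmax + 1):
        p.append(p[-1] * k / (k + 2 + 1 / m))
    return p

m = 3
rec = price_recurrence(50, m)
print(max(abs(price_degree_dist(k, m) - rec[k]) for k in range(51)))

# Local slope of the tail on a log-log scale; it should approach 2 + 1/m.
k = 10**5
print(math.log(price_degree_dist(k, m) / price_degree_dist(2 * k, m)) / math.log(2))
```

Since m enters only through 1/m, the same functions accept non-integer m, including values less than 1.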

Note that Price's assumption that the offset parameter
$k_0 = 1$ can be justified a posteriori because the value of
the exponent does not depend on $k_0$. (This contrasts with
the behavior of the model of Barabasi and Albert [32],
which is discussed in Sec. VII.C.) The argument above
is easily generalized to the case $k_0 \ne 1$, and we find that

$$p_k = \frac{m+1}{m(k_0+1)+1}\, \frac{\mathrm{B}(k+k_0,\, 2+1/m)}{\mathrm{B}(k_0,\, 2+1/m)}, \tag{64}$$

and hence $\alpha = 2 + 1/m$ again for large k and fixed $k_0$.
See Sec. VII.C for further discussion
of the effects of offset parameters. Thorough reviews
of master-equation methods for grown graph models
have been given by Dorogovtsev and Mendes [120]
and Krapivsky and Redner [248].

The analytic solution above was the extent of the 
progress Price was able to make in understanding his 
model network. Unlike present-day authors, for instance, 
he did not have computational resources available to sim- 
ulate the model, and so could give no numerical results. 
In recent years, a great deal more progress has been made 
in understanding cumulative advantage processes and the 
growth of networks. Most of this work has been carried 
out using a slightly different model, however, the model 
of Barabasi and Albert, which we now describe. 
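
With present-day resources a simulation of Price's model takes only a few lines. The sketch below is an illustration under stated assumptions, not Price's own procedure: every new vertex has exactly m outgoing citations (the model itself only fixes the mean), $k_0 = 1$, repeated citations of the same paper are allowed, and the function name `simulate_price` is hypothetical. Attachment proportional to k + 1 is obtained by keeping a list in which each vertex appears once for itself and once for every citation it has received.

```python
import random

def simulate_price(n, m, seed=0):
    """Grow a citation network of n vertices; return the list of in-degrees.

    Assumptions of this sketch: fixed integer out-degree m, offset k0 = 1,
    and multiple citations of the same vertex are permitted.
    """
    rng = random.Random(seed)
    indeg = [0]
    targets = [0]          # vertex i appears indeg[i] + 1 times in this list
    for new in range(1, n):
        # Choose all m targets before updating, so that they are drawn
        # from the network as it stood before the new vertex arrived.
        cited = [rng.choice(targets) for _ in range(m)]
        for v in cited:
            indeg[v] += 1
            targets.append(v)
        indeg.append(0)
        targets.append(new)
    return indeg

indeg = simulate_price(20000, 1, seed=1)
print(indeg.count(0) / len(indeg))   # compare with p_0 = (m + 1)/(2m + 1)
```

Drawing a uniformly random element of `targets` selects a vertex with probability proportional to its in-degree plus one, so each step takes constant time.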



B. The model of Barabasi and Albert 

The mechanism of cumulative advantage proposed by
Price [344] is now widely accepted as the probable explanation
for the power-law degree distribution observed
not only in citation networks but in a wide variety of
other networks also, including the World Wide Web, collaboration
networks, and possibly the Internet and other
technological networks also. The work of Price himself,
however, is largely unknown in the scientific community,
and cumulative advantage did not achieve currency as
a model of network growth until its rediscovery some
decades later by Barabasi and Albert [32], who gave it
the new name of preferential attachment. In a highly
influential paper published (like Price's first paper on
citation networks) in the journal Science, they proposed
a network growth model of the Web that is very similar
to Price's, but with one important difference.



The model of Barabasi and Albert [32, 33] is the same
as Price's in having vertices that are added to the network
with degree m, which is never changed thereafter,
the other end of each edge being attached to ("citing")
another vertex with probability proportional to the de- 
gree of that vertex. The difference between the two mod- 
els is that in the model of Barabasi and Albert edges are 
undirected, so there is no distinction between in- and out- 
degree. This has pros and cons. On the one hand, both 
citation networks and the Web are in reality directed 
graphs, so any undirected graph model is missing a cru- 
cial feature of these networks. On the other hand, by 
ignoring the directed nature of the network, the model of 
Barabasi and Albert gets around Price's problem of how a 
paper gets its first citation or a Web site gets its first link. 
Each vertex in the graph appears with initial degree m, 
and hence automatically has a non-zero probability of re- 
ceiving new links. (Note that for the model to be solvable 
using the master-equation approach as demonstrated be- 
low, the number of edges added with each vertex must 
be exactly m — it cannot vary around the mean value as 
in the model of Price. Hence it must also be an integer 
and must always have a value m ≥ 1.)

Another way of looking at the model of Barabasi and 
Albert is to say the network is directed, with edges go- 
ing from the vertex just added to the vertex that it is 
citing or linking to, but that the probability of attach- 
ment of a new edge is proportional to the sum of the in- 
and out-degrees of the vertex. This however is perhaps 
a less satisfactory viewpoint, since it is difficult to con- 
jure up a mechanism, either for citation networks or the 
Web, which would give rise to such an attachment pro- 
cess. Overall, perhaps the best way to look at the model 
of Barabasi and Albert is as a model that sacrifices some 
of the realism of Price's model in favor of simplicity. As 
we will see, the main result of this sacrifice is that the
model produces only a single value $\alpha = 3$ for the exponent
governing the degree distribution, although this
has been remedied in later generalizations of the model,
which we discuss in Sec. VII.C.

The model of Barabasi and Albert can be solved exactly
in the limit of large graph size 31 using the master-equation
method and such a solution has been given by
Krapivsky et al. [249] and independently by Dorogovtsev
et al. [123]. (Barabasi and Albert themselves gave an
approximate solution based on the assumption that all
vertices of the same age have the same degree [32, 33].
The method of Krapivsky et al. and Dorogovtsev et al.
does not make this assumption.)

The probability that a new edge attaches to a vertex
of degree k (the equivalent of Eq. (58)) is

$$\frac{k p_k}{\sum_k k p_k} = \frac{k p_k}{2m}. \tag{65}$$

The sum in the denominator is equal to the mean degree
of the network, which is 2m, since there are m edges for
each vertex added, and each edge, being now undirected,
contributes two ends to the degrees of network vertices.
Now the mean number of vertices of degree k that gain
an edge when a single new vertex with m edges is added
is $m \times k p_k/2m = \frac{1}{2} k p_k$, independent of m. The number
$np_k$ of vertices with degree k thus decreases by this
same amount, since the vertices that get new edges become
vertices of degree k + 1. The number of vertices
of degree k also increases because of influx from vertices
previously of degree k − 1 that have also just acquired
a new edge, except for vertices of degree m, which have
an influx of exactly 1. If we denote by $p_{k,n}$ the value of
$p_k$ when the graph has n vertices, then the net change in
$np_k$ per vertex added is

$$(n+1)p_{k,n+1} - np_{k,n} = \tfrac{1}{2}(k-1)p_{k-1,n} - \tfrac{1}{2}k p_{k,n}, \tag{66}$$

for k > m, or

$$(n+1)p_{m,n+1} - np_{m,n} = 1 - \tfrac{1}{2}m p_{m,n}, \tag{67}$$

for k = m, and there are no vertices with k < m.

Looking for stationary solutions $p_{k,n+1} = p_{k,n} = p_k$ as
before, the equations equivalent to Eq. (61) for the model
are

$$p_k = \begin{cases} \tfrac{1}{2}(k-1)p_{k-1} - \tfrac{1}{2}k p_k & \text{for } k > m, \\ 1 - \tfrac{1}{2}m p_m & \text{for } k = m. \end{cases} \tag{68}$$

Rearranging for $p_k$ once again, we find $p_m = 2/(m+2)$
and $p_k = p_{k-1}(k-1)/(k+2)$, or

$$p_k = \frac{(k-1)(k-2)\cdots m}{(k+2)(k+1)\cdots(m+3)}\, p_m = \frac{2m(m+1)}{(k+2)(k+1)k}. \tag{69}$$

In the limit of large k this gives a power law degree
distribution $p_k \sim k^{-3}$, with only the single fixed exponent
$\alpha = 3$. A more rigorous derivation of this result has
been given by Bollobas et al. [65].
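
The stationary solution can be verified with exact rational arithmetic. The sketch below, with illustrative function names, iterates the recurrence $p_k = p_{k-1}(k-1)/(k+2)$ from $p_m = 2/(m+2)$ and compares the result with the closed form $2m(m+1)/[k(k+1)(k+2)]$. It also uses the telescoping identity $\sum_{k=m}^{K} p_k = 1 - m(m+1)/[(K+1)(K+2)]$, which follows from partial fractions and shows that the distribution is normalized as K grows.

```python
from fractions import Fraction

def ba_degree_dist(k, m):
    """Closed form p_k = 2m(m+1) / (k (k+1) (k+2)) for k >= m, zero below m."""
    if k < m:
        return Fraction(0)
    return Fraction(2 * m * (m + 1), k * (k + 1) * (k + 2))

def ba_recurrence(kmax, m):
    """p_m ... p_kmax from p_m = 2/(m+2) and p_k = p_{k-1} (k-1)/(k+2)."""
    p = {m: Fraction(2, m + 2)}
    for k in range(m + 1, kmax + 1):
        p[k] = p[k - 1] * Fraction(k - 1, k + 2)
    return p

m, K = 3, 200
rec = ba_recurrence(K, m)
print(all(rec[k] == ba_degree_dist(k, m) for k in range(m, K + 1)))

partial = sum(ba_degree_dist(k, m) for k in range(m, K + 1))
print(partial == 1 - Fraction(m * (m + 1), (K + 1) * (K + 2)))
```

Using `Fraction` avoids any rounding, so the comparisons are exact rather than approximate.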



31 The behavior of the model at finite system sizes has been
investigated by Krapivsky and Redner.

In addition to the basic solution of the model for its
degree distribution, many other results are now known
about the model of Barabasi and Albert. Krapivsky and
Redner [245] have conducted a thorough analytic study
of the model, showing among other things that the model
has two important types of correlations. First, there is a
correlation between the age of vertices and their degrees,
with older vertices having higher mean degree. For the
case m = 1, for instance, they find that the probability
distribution of the degree of a vertex i with age a, measured
as the number of vertices added after vertex i, is

$$p_k(a) = \sqrt{1 - \frac{a}{n}} \left(1 - \sqrt{1 - \frac{a}{n}}\right)^{k}. \tag{70}$$

Thus for specified age a the distribution is exponential,
with a characteristic degree scale that diverges as
$(1 - a/n)^{-1/2}$ as $a \to n$; the earliest vertices added have
substantially higher expected degree than those added
later, and the overall power-law degree distribution of
the whole graph is a result primarily of the influence of
these earliest vertices.

This correlation between degree and age has been used
by Adamic and Huberman [4] to argue against the model
as a model of the World Wide Web: they show using actual
Web data that there is no such correlation in the real
Web. This does not mean that preferential attachment
is not the explanation for power-law degree distributions
in the Web, only that the dynamics of the Web must be
more complicated than this simple model to account also
for the observed age distribution [35]. An extension of the
model that may explain why age and degree are not correlated
has been given by Bianconi and Barabasi [52, 53]
and is discussed in Sec. VII.C.

Second, Krapivsky and Redner |245| show that there 
are correlations between the degrees of adjacent vertices 
in the model, of the type discussed in Sec. IIII.Fl Looking 
again at the special case m — 1, they show that the 
quantity ejk defined in Sec. IIV.B.5I which is the number 
of edges that connect vertex pairs with (excess) degrees 
j and k, is 



$$e_{jk} = \frac{4j}{(k+1)(k+2)(j+k+2)(j+k+3)(j+k+4)} + \frac{12j}{(k+1)(j+k+1)(j+k+2)(j+k+3)(j+k+4)}. \qquad (71)$$

Note that this quantity is asymmetric. This is because 
Krapivsky and Redner regard the network as being di- 
rected, with edges leading from the vertex just added 
to the pre-existing vertex to which they attach. In the 
expression above, however, j and k are total degrees of 
vertices, not in- and out-degree. 

Although Eq. (71) shows that the vertices of the model
have non-trivial correlations, the correlation coefficient of
the degrees of adjacent vertices in the network is asymp-
totically zero as $n \to \infty$ [314]. This is because the corre-
lation coefficient measures correlations relative to a linear
model, and no such correlations are present in this case.

One of the main advantages that we have today over 
early workers such as Price is the widespread availabil- 
ity of powerful computer resources. Quite a number of 
numerical studies have been performed of the model of 
Barabasi and Albert, which would have been entirely im- 
possible thirty years earlier. It is worth mentioning here 
how simulations of these types of models are conducted. 
We consider the Barabasi-Albert model. The exact same 
ideas can be applied to Price's model also. 

A naive simulation of the preferential attachment pro- 
cess is quite inefficient. In order to attach to a vertex in 
proportion to its degree we normally need to examine the 
degrees of all vertices in turn, a process that takes O(n) 
time for each step of the algorithm. Thus the generation 
of a graph of size n would take $O(n^2)$ steps overall. A
much better procedure, which works in $O(1)$ time per
step and $O(n)$ time overall, is the following. We main-
tain a list, in an integer array for instance, that includes
$k_i$ entries of value i for each vertex i. Thus, for exam-
ple, a network of four vertices labeled 1, 2, 3, and 4 with 
degrees 2, 1, 1, and 3, respectively could be represented 
by the array (1,1,2,3,4,4,4). Then in order to choose 
a target vertex for a new edge with the correct preferen- 
tial attachment, one simply chooses a number at random 
from this list. Of course, the list must be updated as 
new vertices and edges are added, but this is simple. No- 
tice that there is no requirement that the items in the 
list be in any particular order. If we add a new vertex 5 
to our network above, for example, with degree 1 and 
one edge that connects it to vertex 2, the list can be up- 
dated by adding new items to the end, so that it reads 
(1,1,2,3,4,4,4,5,2). And so forth. Models such as
Price's, in which there is an offset $k_0$ in the probability
of selecting a vertex (so that the total probability goes as
$k + k_0$), can be treated with the same method; the off-
set merely means that with some probability one chooses
a vertex with preferential attachment and otherwise one 
chooses it uniformly from the set of all vertices. 
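
As an illustration, the list-based procedure just described might be sketched in Python as follows. This is only a minimal sketch under assumptions of our own that are not part of the description above: the function name `barabasi_albert_edges` is hypothetical, the seed graph is taken to be a complete graph on m + 1 vertices, and the m targets of each new vertex are required to be distinct.

```python
import random

def barabasi_albert_edges(n, m, seed=None):
    """Grow a Barabasi-Albert-style graph using the repeated-vertex list.

    Assumptions (ours, for illustration): the seed is a complete graph on
    m + 1 vertices and the m targets of each new vertex are distinct.
    Requires m >= 1.
    """
    rng = random.Random(seed)
    edges = []
    stubs = []  # vertex i appears exactly k_i times in this list
    for u in range(m + 1):
        for v in range(u):
            edges.append((u, v))
            stubs.extend((u, v))
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            # A uniform draw from the list is a degree-proportional draw.
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))  # order within the list is irrelevant
    return edges, stubs
```

Each draw from the list takes constant time, so apart from the occasional redraw of a duplicate target the running time is linear in the number of edges added.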

An alternative method for simulating the model of 
Barabasi and Albert has been described by Krapivsky
and Redner [245]. Their method uses the network struc-
ture itself in place of the list of vertices above and works
as follows. The model is regarded as a directed network 
in which there are exactly m edges running out of each 
vertex, pointing to others. We first pick a vertex at 
random from the graph and then with some probabil- 
ity we either keep that vertex or we "redirect" to one 
of its neighbors, meaning that we pick at random one of 
the vertices it points to. Since each vertex has exactly 
m outgoing edges, the latter operation is equivalent to 
choosing an edge at random from the graph and following 
it, and hence alights on a target vertex with probability 
proportional to the in-degree j of that target (because 
there are j ways to arrive at a vertex of in-degree j; see
Sec. IV.B.1). Thus the total probability of selecting any
given vertex is proportional to j + c, where c is some
constant. However, since the out-degree of all vertices is
simply m, the total degree is k = j + m and the selection
probability is therefore also proportional to k + c - m.
By choosing the probability of redirection appropriately,
we can arrange for the constant c to be equal to m, and
hence for the probability of selecting a vertex to be sim-
ply proportional to k. Since it does not require an extra
array for the vertex list, this method of simulation is more 
memory efficient than the previous method, although it 
is slightly more complicated to implement. 
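
A minimal sketch of the redirection procedure is given below. If one redirects with probability r, a vertex of in-degree j in a graph of n vertices is reached with probability (1 - r)/n + rj/(nm), which is proportional to j + m(1 - r)/r, so that r = 1/2 gives c = m. That value of r is our own worked step from the description above rather than a quotation from Ref. [245]. The function name `redirection_graph`, the complete-digraph seed on m + 1 vertices, and the tolerance of repeated targets are likewise assumptions made only for illustration.

```python
import random

def redirection_graph(n, m, r=0.5, seed=None):
    """Directed growth by 'pick uniformly, then perhaps redirect'.

    out[v] lists the m vertices that v points to. With redirection
    probability r, a vertex of in-degree j is selected with probability
    proportional to j + m(1 - r)/r, so r = 1/2 corresponds to k = j + m.
    Assumptions (ours): the seed is a complete digraph on m + 1 vertices,
    and repeated targets (multi-edges) are allowed. Requires m >= 1.
    """
    rng = random.Random(seed)
    out = [[v for v in range(m + 1) if v != u] for u in range(m + 1)]
    for new in range(m + 1, n):
        targets = []
        for _ in range(m):
            v = rng.randrange(new)       # uniform over existing vertices
            if rng.random() < r:
                v = rng.choice(out[v])   # follow one of its m out-edges
            targets.append(v)
        out.append(targets)
    return out
```

No auxiliary list of vertices is stored here; the adjacency structure `out` is the only data kept, which is the memory saving referred to above.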

In their original paper on their model, Barabasi and 
Albert [32] gave simulations showing the power-law dis-
tribution of degrees. A number of authors have sub-
sequently published more extensive simulation results. 
Of particular note is the work by Dorogovtsev and
Mendes [114, 116] and by Krapivsky and Redner [246].

A crucial element of both the models of Price and of 
Barabasi and Albert is the assumption of linear preferen- 
tial attachment. It is worth asking whether there is any 
empirical evidence in support of this assumption. (We 
discuss in the next section some work on models that 
relax the linearity assumption.) Two studies indicate 
that it may be a reasonable approximation to the truth. 
Jeong et al. [213] looked at the time evolution of citation
networks, the Internet, and actor and scientist collabo- 
ration networks, and measured the number of new edges 
a vertex acquires in a single year as a function of the 
number of previously existing edges. They found that 
the one quantity was roughly proportional to the other, 
and hence concluded that linear preferential attachment
was at work in these networks. Newman [310] performed
a similar study for scientific collaboration networks, but 
with finer time resolution, measured by the publication 
of individual papers, and came to similar conclusions. 



C. Generalizations of the Barabasi-Albert model 

The model of Barabasi and Albert has attracted 
an exceptional amount of attention in the literature. In 
addition to analytic and numerical studies of the model 
itself, many authors have suggested extensions or modi- 
fications of the model that alter its behavior or make it a 
more realistic representation of processes taking place in 
real-world networks. We discuss a few of these here. A
more extensive review of developments in this area has 
been given by Albert and Barabasi (see particularly
Table III in that paper). 

Dorogovtsev et al. [123] and Krapivsky and Red-
ner [245] have examined the model in which the prob-
ability of attachment to a vertex of degree k is propor-
tional to $k + k_0$, where the offset $k_0$ is a constant. Note
that $k_0$ is allowed to be negative; it can fall anywhere in
the range $-m < k_0 < \infty$ and the probability of attach-
ment will be positive. The equations for the stationary
state of the degree distribution of this model, analogous
to Eq. (68), are

$$p_k = \begin{cases} \left[(k-1)p_{k-1} - k p_k\right] m/(2m + k_0) & \text{for } k > m, \\ 1 - p_m\, m^2/(2m + k_0) & \text{for } k = m, \end{cases} \qquad (72)$$

which gives $p_m = (2m + k_0)/(m^2 + 2m + k_0)$ and

$$p_k = \frac{(k-1)\cdots m}{(k + 2 + k_0/m)\cdots(m + 3 + k_0/m)}\,p_m = \frac{B(k, 3 + k_0/m)}{B(m, 2 + k_0/m)}, \qquad (73)$$

where $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b)$ is again the Legendre
beta-function. This gives a power law for large k once
more, with exponent $\alpha = 3 + k_0/m$. It is proposed that
negative values of $k_0$ could be the explanation for the
values $\alpha < 3$ seen in real-world networks (footnote 32). A longer
discussion of the effects of offset parameters is given in
Ref. [245].
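
As a numerical cross-check, the product form and the beta-function form of Eq. (73) can be compared directly. The sketch below is ours and is not taken from the cited papers; the function names are hypothetical. It uses the ratio $p_k/p_{k-1} = (k-1)/(k + 2 + k_0/m)$ that follows from Eq. (72) for k > m, together with the stated value of $p_m$.

```python
from math import lgamma, exp

def log_beta(a, b):
    """log B(a, b) = log Gamma(a) + log Gamma(b) - log Gamma(a + b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def pk_recurrence(k, m, k0):
    """p_k built from p_m using p_k / p_{k-1} = (k - 1)/(k + 2 + k0/m)."""
    p = (2 * m + k0) / (m * m + 2 * m + k0)   # p_m
    for j in range(m + 1, k + 1):
        p *= (j - 1) / (j + 2 + k0 / m)
    return p

def pk_beta(k, m, k0):
    """Closed form B(k, 3 + k0/m) / B(m, 2 + k0/m); needs k0 > -2m."""
    c = k0 / m
    return exp(log_beta(k, 3 + c) - log_beta(m, 2 + c))
```

For m = 1 and $k_0 = 0$ the closed form reduces to 4/[k(k+1)(k+2)], which falls off as $k^{-3}$ for large k, in agreement with the exponent $3 + k_0/m$.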

Krapivsky et al. [245, 249] also consider another im-
portant generalization of the model, to the case where
the probability of attachment to a vertex is not linear
in the degree k of the vertex, but goes instead as some
general power of degree $k^\gamma$. Again this model is solvable
using methods similar to those above, and the authors
find three general classes of behavior. For $\gamma = 1$ exactly,
we recover the normal linear preferential attachment and
power-law degree sequences. For $\gamma < 1$, the degree distri-
bution is a power law multiplied by a stretched exponen-
tial, whose exponent is a complicated function of $\gamma$. (In
fact, in most cases there is no known analytic solution
for the equations governing the exponent; they must be
solved numerically.) For $\gamma > 1$ there is a "condensation"
phenomenon, in which a single vertex gets a finite frac-
tion of all the connections in the network, and for $\gamma > 2$
there is a non-zero probability that this "gel node" will 
be connected to every other vertex on the graph. The 
remainder of the vertices have an exponentially decaying 
degree distribution. 
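
The three regimes can be explored with a small simulation. The following is a sketch under our own assumptions (m = 1, a seed consisting of a single edge, and a hypothetical function name `grow_nonlinear`); it recomputes all weights at every step and so costs O(n) per step, unlike the list method available for the linear case.

```python
import random

def grow_nonlinear(n, gamma, seed=None):
    """Each new vertex attaches one edge to vertex i with weight k_i**gamma.

    Assumptions (ours): m = 1 and the seed is one edge joining vertices
    0 and 1. Returns the list of vertex degrees.
    """
    rng = random.Random(seed)
    degree = [1, 1]
    for new in range(2, n):
        weights = [k ** gamma for k in degree]
        target = rng.choices(range(new), weights=weights)[0]
        degree[target] += 1
        degree.append(1)   # the new vertex arrives with its one edge
    return degree
```

Comparing `max(degree) / n` for values of gamma below and above 1 gives a rough picture of the condensation phenomenon described above, although a run of modest size is of course no substitute for the analytic results.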

Another variation on the basic growing network theme 
is to make the mean degree change over time. There is 
evidence to suggest that in the World Wide Web the aver- 
age degree of a vertex is increasing with time, i.e., the pa- 
rameter m appearing in the models is increasing. Doro-
govtsev and Mendes [118, 121] have studied a variation
of the Barabasi-Albert model that incorporates this pro-
cess. They assume that the number m of new edges
added per new vertex increases with network size n as
$n^a$ for some constant a, and that the probability of at-
taching to a given vertex goes as $k + Bn^a$ for constant B.
They show that the resulting degree distribution follows
a power law with exponent $\alpha = 2 + B(1 + a)/(1 - Ba)$.
(Note that when a = 0, this model reduces to the model
studied previously by Dorogovtsev et al. [123], but the ex-
pression for $\alpha$ given here is not valid in this limit.) Thus
this process offers another possible mechanism by which
the exponent of the degree distribution can be tuned to
match that observed in real-world networks.

Footnote 32: Price's result $\alpha = 2 + 1/m$ [344] corresponds to $k_0 = -(m - 1)$
so that the "attractiveness" of a new vertex is 1. The model of
Barabasi and Albert corresponds to $k_0 = 0$, so that $\alpha = 3$.

In Price's model of citation networks, no new out-going 
edges are added to a vertex after its first appearance, and 
edges once added to the graph remain where they are 
forever. This makes sense for citation networks. But the 
model of Barabasi and Albert is intended to be a model of 
the World Wide Web, in which new links are often added 
to pre-existing Web sites, and old links are frequently 
moved or removed. A number of authors have proposed 
models that incorporate processes like these. In par-
ticular, Dorogovtsev and Mendes [116] have proposed a
model that adds to the standard Barabasi-Albert model
an extra mechanism whereby edges appear and disappear 
between pre-existing vertices with stochastically constant 
but possibly different rates. They find that over a wide 
range of values of the rates the power-law degree distri- 
bution is maintained, although again the exponent varies 
from the value $\alpha = 3$ seen in the original model. Krapivsky
and Redner [247] have also proposed a model that allows
edges to be added after vertices are created, which we
discuss in the next section. Albert and Barabasi and
Tadic [391, 392] have studied models in which edges can
move around the network after they are added. These 
models can show both power-law and exponential degree 
distributions depending on the model parameters. 

As discussed in Sec. VII.B, Adamic and Huberman [4]
have shown that the real World Wide Web does not have 
the correlations between age and degree of vertices that 
are found in the model of Barabasi and Albert. Adamic 
and Huberman suggest that this is because the degree of 
vertices is also a function of their intrinsic worth; some 
Web sites are useful to more people than others and so 
gain links at a higher rate. Bianconi and Barabasi [52, 53]
have proposed an extension of the Barabasi-Albert model
that mimics this process. In their model each newly ap-
pearing vertex i is given a "fitness" $\eta_i$ that represents
its attractiveness and hence its propensity to accrue new
links. Fitnesses are chosen from some distribution $\rho(\eta)$
and links attach to vertices with probability proportional
now not just to the degree $k_i$ of vertex i but to the prod-
uct $\eta_i k_i$.

Depending on the form of the distribution $\rho(\eta)$ this
model shows two regimes of behavior [52, 247]. If
the distribution has finite support, then the network
shows a power-law degree distribution, as in the origi-
nal Barabasi-Albert model. However, if the distribution
has infinite support, then the one vertex with the high- 
est fitness accrues a finite fraction of all the edges in the 
network — a sort of "winner takes all" phenomenon, which 
Bianconi and Barabasi liken to monopoly dominance of 
a market. 
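
The attachment rule of the fitness model is straightforward to sketch in code. The version below is an illustration under our own assumptions and not the authors' implementation: it takes m = 1, starts from a single edge, recomputes the weights $\eta_i k_i$ at each step, and leaves the fitness distribution to a caller-supplied function. The names `grow_with_fitness` and `draw_fitness` are hypothetical.

```python
import random

def grow_with_fitness(n, draw_fitness, seed=None):
    """m = 1 growth in which vertex i is chosen with weight eta_i * k_i.

    draw_fitness(rng) must return one positive fitness value; the choice
    of distribution is left to the caller. Assumptions (ours): the seed
    is one edge joining vertices 0 and 1, and the weights are recomputed
    at every step, which costs O(n) per step.
    """
    rng = random.Random(seed)
    eta = [draw_fitness(rng), draw_fitness(rng)]
    degree = [1, 1]
    for new in range(2, n):
        weights = [e * k for e, k in zip(eta, degree)]
        target = rng.choices(range(new), weights=weights)[0]
        degree[target] += 1
        degree.append(1)
        eta.append(draw_fitness(rng))
    return eta, degree
```

For example, `grow_with_fitness(10000, lambda rng: rng.uniform(0.1, 1.0))` draws fitnesses from a bounded interval, while `lambda rng: rng.expovariate(1.0)` gives an unbounded distribution; the fraction of edges held by the fittest vertex can then be compared between the two.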

A number of variations on the fitness theme have been
studied by Ergün and Rodgers [145], who looked at a
directed version of the Bianconi-Barabasi model and
at models where instead of multiplying the attachment
probability, the fitness $\eta_i$ contributes additively to the
probability of attaching a new edge to vertex i. Treat- 
ing the models analytically, they found in each case that 
for suitable parameter values the power-law degree dis- 
tribution is preserved, although again the exponent may 
be affected by the distribution of fitnesses, and in some 
cases there are also logarithmic corrections to the degree 
distribution. A model with vertex fitness but no preferen- 
tial attachment has been studied by Caldarelli et al.,
and also gives power-law degree distributions under some 
circumstances. 



D. Other growth models 

The model of Barabasi and Albert [32] is elegant and 
simple, but lacks a number of features that are present 
in the real World Wide Web: 

• The model is a model of an undirected network, 
where the real Web is directed. 

• As mentioned previously one can regard the model 
as a model of a directed network, but in that 
case attachment is in proportion to the sum of in- 
and out-degrees of a vertex, which is unrealistic — 
presumably attachment should be in proportion to 
in-degree only, as in the model of Price. 

• If we regard the model as producing a directed 
network, then it generates acyclic graphs (see 
Sec. I.A), which are a poor representation of the
Web. 

• All vertices in the model belong to a single con- 
nected component (a weakly connected component 
if the graph is regarded as directed — the graph has 
no strongly connected components because it is 
acyclic). In the real Web there are many separate 
components (and strongly connected components). 

• The out-degree distribution of the Web follows a 
power law, whereas out-degree is a constant in the 
model (footnote 33).



Footnote 33: What's more, although it is rarely pointed out, it is clearly the
case that a different mechanism must be responsible for the out- 
degree distribution from the one responsible for the in-degree 
distribution. We can justify preferential attachment for in-degree 
by saying that Web sites are easier to find if they have more links 
to them, and hence they get more new links because people find 
them. No such argument applies for out-degree. It is usually 
assumed that out-degree is subject to preferential attachment 
nonetheless. One can certainly argue that sites with many out- 
going links are more likely to add new ones in the future than 
sites with few, but it's far from clear that this must be the case. 
Many of these criticisms are also true of Price's model,
but Price's model is intended to be a model of a citation 
network and citation networks really are directed, acyclic, 
and to a good approximation all vertices belong to a sin- 
gle component, unless they cite and are cited by no one 
else at all. Thus Price's model is, within its own lim- 
ited sphere, a reasonable one. For the World Wide Web 
a number of authors have suggested new growth models 
that address one or more of the concerns above. Here we 
describe a number of these models, starting with some 
very simple ones and working up to the more complex. 

Consider first the issue of the component structure of 
the network. In the models of Price and of Barabasi 
and Albert each vertex joins to at least one other when 
it first appears. It follows trivially then that, so long 
as no edges are ever removed, all vertices belong to a 
single (weakly-connected) component. This is not true 
in the real Web. How can we get around it? To address 
this question Callaway et al. proposed the following
extremely simple model of a growing network. Vertices 
are added to the network one by one as before, and a 
mean number m of undirected edges are added with each 
vertex. As with Price's model, the value of m is only an 
average — the actual number of edges added per step can 
vary — and so m is not restricted to integer values, and 
indeed we will see that the interesting behavior of the 
model takes place at values m < 1. 

The important difference between this model and the 
previous models is that edges are not, in general, at- 
tached to the vertex that has just been added. Instead, 
both ends of each edge are attached to vertices chosen 
uniformly at random from the whole graph, without pref- 
erential attachment. Vertices therefore normally have 
degree zero when they are first added to the graph. Be- 
cause of the lack of preferential attachment this model 
does not show power-law degree distributions (in fact
the degree distribution can be shown to be exponential),
but it does have an interesting component structure. A
related model has been studied, albeit to somewhat dif- 
ferent purpose, by Aldous and Pittel. Their model
is equivalent to the model of Callaway et al. in the case
m = 1. Also Bauer and collaborators have in-
vestigated a directed-graph version of the model.
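
The growth rule of this model is simple enough to simulate in a few lines. The sketch below rests on assumptions of our own: for m at most 1 it adds a single edge with probability m after each vertex, which is one way of realizing a mean of m edges per step; it permits self-loops and repeated edges; and it tracks components with a union-find structure. The function name `grow_uniform` is hypothetical.

```python
import random

def grow_uniform(n, m, seed=None):
    """Add n vertices; after each, with probability m (m <= 1), join two
    vertices chosen uniformly at random from those present.

    Assumptions (ours): Bernoulli edge addition realizes the mean of m
    edges per step; self-loops and repeated edges are allowed.
    Returns the list of component sizes.
    """
    rng = random.Random(seed)
    parent, size = [], []

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for new in range(n):
        parent.append(new)
        size.append(1)
        if rng.random() < m:
            a = find(rng.randrange(new + 1))
            b = find(rng.randrange(new + 1))
            if a != b:
                if size[a] < size[b]:
                    a, b = b, a
                parent[b] = a               # merge smaller into larger
                size[a] += size[b]
    return [size[v] for v in range(n) if parent[v] == v]
```

Plotting `max(grow_uniform(n, m)) / n` against m for large n gives a direct, if noisy, view of the emergence of the giant component.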

Initially, one might imagine that the model of Calla- 
way et al. generated an ordinary Poisson random graph 
of the Erdos-Renyi type. Further reflection reveals how- 
ever that this is not the case; older vertices in the network 
will tend to be connected to one another, so the network 
has a cliquish core of old-timers surrounded by a sea of 
younger vertices. Nonetheless, like the Poisson random 
graph, the model does have many separate components, 
with a phase transition at a finite value of m at which a 
giant component appears that occupies a fixed fraction 
of the volume of the network as $n \to \infty$. To demonstrate
this, Callaway et al. used a master-equation approach
similar to that used for degree distributions in the pre-
ceding sections. One defines $p_s$ to be the probability
that a randomly chosen vertex belongs to a component
of s vertices, and writes difference equations that give the
change in $p_s$ when a single vertex and m edges are added
to the graph. Looking for stationary solutions, one then
finds in the limit of large graph size that



$$p_s = \begin{cases} ms \sum_{j=1}^{s-1} p_j p_{s-j} - 2ms\,p_s & \text{for } s > 1, \\ 1 - 2m p_1 & \text{for } s = 1. \end{cases} \qquad (74)$$



Being nonlinear in $p_s$, these equations are harder to
solve than those for the degree distributions in previ-
ous sections, and indeed no exact solution has been
found. Nonetheless, we can see that a giant compo-
nent must form by defining a generating function for the
component size distribution similar to that of Eq. (25):
$H(x) = \sum_{s=1}^{\infty} p_s x^s$. Then (74) implies that

$$\frac{dH}{dx} = \frac{1}{2m}\,\frac{1 - H(x)/x}{1 - H(x)}. \qquad (75)$$



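If there is no giant component, then H(1) = 1 and the
average component size is $\langle s \rangle = H'(1)$. Taking the limit
$x \to 1$ in Eq. (75), we find that $\langle s \rangle$ is a solution of the
quadratic equation $2m\langle s \rangle^2 - \langle s \rangle + 1 = 0$, or

$$\langle s \rangle = \frac{1 - \sqrt{1 - 8m}}{4m}. \qquad (76)$$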



(The other solution to the quadratic gives a non-physical 
value.) This solution exists only up to $m = \frac{1}{8}$ however,
and hence above this point there must be a giant compo-
nent. This doesn't tell us where in the interval $0 \le m \le \frac{1}{8}$
the giant component appears, but a proof that the tran-
sition in fact falls precisely at $m = \frac{1}{8}$ was later given by
Durrett.
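
For completeness, the limit that leads to the quadratic can be written out. The steps below are our own filling-in of the argument, assuming only that H(1) = 1 and $H'(1) = \langle s \rangle$ in the absence of a giant component. Both numerator and denominator of Eq. (75) vanish at x = 1, so we apply l'Hopital's rule:

```latex
\begin{align*}
\langle s\rangle = H'(1)
  &= \frac{1}{2m}\lim_{x\to 1}\frac{1 - H(x)/x}{1 - H(x)}
   = \frac{1}{2m}\,\frac{H(1) - H'(1)}{-H'(1)}
   = \frac{1}{2m}\,\frac{\langle s\rangle - 1}{\langle s\rangle},\\
&\Longrightarrow\quad 2m\langle s\rangle^2 - \langle s\rangle + 1 = 0,
\qquad
\langle s\rangle = \frac{1 \pm \sqrt{1 - 8m}}{4m}.
\end{align*}
```

The roots are real only when $1 - 8m \ge 0$, which is the origin of the bound $m \le \frac{1}{8}$, and the root with the minus sign is the one for which $\langle s \rangle \to 1$ as $m \to 0$, as it must when almost no edges are present.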

The model of Callaway et al. has been general-
ized to include preferential attachment by Dorogovt-
sev et al. [124]. In their version of the model both ends
of each edge are attached in proportion to the degrees of 
vertices plus a constant offset to ensure that vertices of 
degree zero have a chance of receiving an edge. Again 
they find many components and a phase transition at 
nonzero m, and in addition the power-law degree distri- 
bution is now restored. 

Taking the process a step further, Krapivsky and Red- 
ner [247] studied a full directed-graph model in which
both vertices and directed edges are added at stochasti- 
cally constant rates and the out-going end of each edge 
is attached to vertices in proportion to their out-degree 
and the in-going end in proportion to in-degree, plus ap- 
propriate constant offsets. This appears to be quite a 
reasonable model for the growth of the Web. It produces 
a directed graph, it allows edges to be added after the 
creation of a vertex, it allows for separate components 
in the graph, and, as Krapivsky and Redner showed, it 
gives power laws in both the in- and out-degree distri- 
butions, just as observed in the real Web. By varying 
the offset parameters for the in- and out-degree attach- 
ment mechanisms, one can even tune the exponents of 
the two distributions to agree with those observed in the 
wild. (Krapivsky and Redner's model is a development
of an earlier model that they proposed [250] that had all
the same features, but gave rise to only a single weakly 
connected component because each added vertex came 
with one edge that attached it to the rest of the network 
from the outset. In their later paper, they abandoned 
this feature. A similar model has also been studied by 
Rodgers and Darby-Dowman [355].) A slight variation
on the model of Krapivsky and Redner has been pro- 
posed independently by Aiello et al. [9], who give rigorous
proofs of some of its properties. 



E. Vertex copying models 

There are some networks that appear to have power- 
law degree distributions, but for which preferential at- 
tachment is clearly not an appropriate model. Good ex-
amples are biochemical interaction networks of various
kinds. A number of stud-
ies have been performed, for instance, of the interaction
networks of proteins (see Sec. II.D) in which the vertices
are proteins and the edges represent reactions. These 
networks do change on very long time-scales because of 
biological evolution, but there is no reason to suppose 
that protein networks grow according to a simple cu- 
mulative advantage or preferential attachment process. 
Nonetheless, it appears that the degree distribution of 
these networks obeys a power law, at least roughly. 

A possible explanation for this observation has been
suggested by Kleinberg et al. [241, 254], who proposed
that these networks grow, at least in part, by the copying 
of vertices. Kleinberg et al. were interested in the growth 
of the Web, for which their model is as follows. The graph 
grows by stochastically constant addition of vertices and 
addition of directed edges either randomly or by copying 
them from another vertex. Specifically, one chooses an 
existing vertex and a number m of edges to add to it, and 
one then decides the targets of those edges, by choosing 
at random another vertex and copying targets from m 
of its edges, randomly chosen. If the chosen vertex has
fewer than m outgoing edges, then all of its edges are copied
and one moves on to another vertex and copies its edges, 
and so forth until m edges in total have been copied. In 
its most general form, the model of Kleinberg et al. also 
incorporates mechanisms for the removal of edges and 
vertices, which we do not describe here. 
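
A much simplified variant of the copying mechanism can be sketched as follows. This is not the model of Kleinberg et al. as specified above: for brevity we add edges only to the newly arriving vertex, omit the removal of edges and vertices, and choose each target at random with a fixed probability `p_random` rather than counting edges of the two kinds. The function name `grow_by_copying` and the two-vertex seed are likewise our own assumptions.

```python
import random

def grow_by_copying(n, m, p_random, seed=None):
    """Simplified copying growth: each new vertex gets m out-edges.

    For each edge, with probability p_random the target is a uniformly
    random existing vertex; otherwise it is copied from a random out-edge
    of a random existing vertex that has out-edges.
    Assumptions (ours): edges are added only to the new vertex, vertex 0
    starts with no out-edges, vertex 1 points to vertex 0, and nothing is
    ever removed.
    """
    rng = random.Random(seed)
    out = [[], [0]]
    has_out = [1]   # vertices with at least one out-edge
    for new in range(2, n):
        targets = []
        for _ in range(m):
            if rng.random() < p_random:
                targets.append(rng.randrange(new))
            else:
                proto = rng.choice(has_out)
                targets.append(rng.choice(out[proto]))
        out.append(targets)
        if targets:
            has_out.append(new)
    return out
```

Because almost every vertex carries the same number m of out-edges, copying a random out-edge of a random vertex is close to following a random edge, which is why the copied targets are selected roughly in proportion to their in-degree.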

It is straightforward to see that the copying mecha- 
nism will give rise to power-law distributions. The mean 
probability that an edge from a randomly chosen vertex 
will lead to a particular other vertex with in-degree k is 
proportional to k (see Sec. IV.B.1), and hence the rate
of increase of a vertex's degree is proportional to its cur- 
rent degree. As with the model of Price, this mecha- 
nism will never add new edges to vertices that currently 
have degree zero, so Kleinberg et al. also include a finite
probability that the target of a newly added edge will be
chosen at random, so that vertices with degree zero have
a chance to gain edges. In their original paper, Klein-
berg et al. present only numerical evidence that their
model results in a power-law degree distribution, but in
a later paper a subset of the same authors [254] proved
that the degree distribution is a power law with exponent
$\alpha = (2 - a)/(1 - a)$, where a is the ratio of the number
of edges added whose targets are chosen at random to
the number whose targets are copied from other vertices.
For small values of a, between 0 and 1/2, i.e., for models
in which most target selection is by copying, this pro-
duces exponents $2 \le \alpha \le 3$, which is the range observed
in most real-world networks (see Table II). Some further
analytic results for copying models have been given by
Chung et al. [90].

It is not clear whether the copying mechanism really 
is at work in the growth of the World Wide Web, but 
there has been considerable interest in its application as 
a model of the evolution of protein interaction networks 
of one sort or another. The argument here is that the 
genes that code for proteins can and do, in the course of 
their evolutionary development, duplicate. That is, upon 
reproduction of an organism, two copies of a gene are er- 
roneously made where only one existed before. Since the 
proteins coded for by each copy are the same, their in- 
teractions are also the same, i.e., the new gene copies its 
edges in the interaction network from the old. Subse- 
quently, the two genes may develop differences because
of evolutionary drift or selection [404]. Models of protein
networks that make use of copying mechanisms have been
proposed by a number of authors [43, 233, 377, 399].

A variation on the idea of vertex copying appears 
in the autocatalytic network models of Jain and Kr-
ishna, in which a network of interacting chemi-
cal species evolves by reproduction and mutation, giving
rise ultimately to self-sustaining autocatalytic loops rem-
iniscent of the "hypercycles" of Eigen and Schuster [140],
which have been proposed as a possible explanation of the 
origin of life. 



VIII. PROCESSES TAKING PLACE ON NETWORKS 

As discussed in the introduction, the ultimate goal of 
the study of the structure of networks is to understand 
and explain the workings of systems built upon those net- 
works. We would like, for instance, to understand how 
the topology of the World Wide Web affects Web surfing 
and search engines, how the structure of social networks 
affects the spread of information, how the structure of 
a food web affects population dynamics, and so forth. 
Thus, the next logical step after developing models of net- 
work structure, such as those described in the previous 
sections of this review, is to look at the behavior of mod- 
els of physical (or biological or social) processes going on 
on those networks. Progress on this front has been slower 
than progress on understanding network structure, per- 
haps because without a thorough understanding of struc- 
ture an understanding of the effects of that structure is 
hard to come by. However, there have been some impor-
tant advances made, particularly in the study of network
failure, epidemic processes on networks, and constraint
satisfaction problems. In this section we review what
has been learned so far.

FIG. 13 Site and bond percolation on a network (panels: site
percolation, bond percolation). In site per-
colation, vertices ("sites" in the physics parlance) are either
occupied (solid circles) or unoccupied (open circles) and stud-
ies focus on the shape and size of the contiguous clusters of
occupied sites, of which there are three in this small exam-
ple. In bond percolation, it is the edges ("bonds" in physics)
that are occupied or not (black or gray lines) and the vertices
that are connected together by occupied edges that form the
clusters of interest.



A. Percolation theory and network resilience 

Percolation processes, mostly simple site and bond
percolation (see Fig. 13), were among the first processes
taking place on networks to be studied thoroughly,
although a number of variants have been studied
also. A percolation process is one in which vertices or
edges on a graph are randomly designated either "occu- 
pied" or "unoccupied" and one asks about various prop- 
erties of the resulting patterns of vertices. One of the 
main motivations for the percolation model when it was 
first proposed in the 1950s was the modeling of the spread
of disease [73, 187], and it is in this context also that it
was first studied in the current wave of interest in real-
world networks [325]. We consider epidemiological appli-
cations of percolation theory in Sec. VIII.B. Here how-
ever, we depart from the order of historical developments 
to discuss first a simpler application to the question of 
network resilience. 

As discussed in Sec. III.D, real-world networks are
found often to be highly resilient to the random deletion 
of their vertices. Resilience can be measured in differ- 
ent ways, but perhaps the simplest indicator of resilience 
in a network is the variation (or lack of variation) in 
the fraction of vertices in the largest component of the 
network, which we equate with the giant component in 
our models (see Sec. IV.A). If one is thinking of a com-
munication network, for example, in which the existence 
of a connecting path between two vertices means that 
those two can communicate with one another, then the 
vertices in the giant component can communicate with 
an extensive fraction of the entire network, while those 
in the small components can communicate with only a 
few others at most. Following the numerical studies of
Broder et al. [74] and Albert et al. on subsets of the
Web graph, it was quickly realized [81, 93] that the prob-
lem of resilience to random failure of vertices in a network 
is equivalent to a site percolation process on the network. 
Vertices are randomly occupied (working) or unoccupied 
(failed), and the number of vertices remaining that can 
successfully communicate is precisely the giant compo- 
nent of the corresponding percolation model. 
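
The equivalence suggests a direct numerical experiment: occupy each vertex independently with probability q and measure the largest cluster of occupied vertices. A minimal sketch follows, in which the function name `largest_component_fraction` and the adjacency-dictionary representation are our own choices for illustration.

```python
import random
from collections import deque

def largest_component_fraction(adj, q, seed=None):
    """Occupy each vertex with probability q and return the size of the
    largest cluster of occupied vertices divided by the total number of
    vertices.

    adj maps each vertex to an iterable of its neighbours (undirected).
    """
    rng = random.Random(seed)
    occupied = {v for v in adj if rng.random() < q}
    seen, best = set(), 0
    for start in occupied:
        if start in seen:
            continue
        seen.add(start)
        queue, count = deque([start]), 0
        while queue:                      # breadth-first search
            v = queue.popleft()
            count += 1
            for w in adj[v]:
                if w in occupied and w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, count)
    return best / len(adj)
```

Sweeping q from 1 down to 0 on a fixed network and recording the returned fraction gives the kind of resilience curve discussed in Sec. III.D.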

A number of analytic results have been derived for percolation
on networks with the structure of the configuration
model of Sec. IV.B.1, i.e., a random graph with a
given degree sequence. Cohen et al. [93] made the following
simple argument. Suppose we have a configuration
model with degree distribution $p_k$. That is, a randomly
chosen vertex has degree $k$ with probability $p_k$ in the limit
of large number $n$ of vertices. Now suppose that only a
fraction $q$ of the vertices are "occupied," or functional,
that fraction chosen uniformly at random from the entire
graph. For a vertex with degree $k$, the number $k'$ of
occupied vertices to which it is connected is distributed
binomially, so that the probability of having a particular
value of $k'$ is $\binom{k}{k'} q^{k'} (1-q)^{k-k'}$, and hence the total
probability that a randomly chosen vertex is connected to $k'$
other occupied vertices is

$$p_{k'} = \sum_{k=k'}^{\infty} p_k \binom{k}{k'} q^{k'} (1-q)^{k-k'}. \tag{77}$$
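
As a numerical check of Eq. (77), the following minimal Python sketch thins a degree distribution by keeping each neighbor with probability q. The function name, the truncation of the distribution at a finite maximum degree, and the Poisson test case are illustrative assumptions and are not taken from Cohen et al.

```python
from math import comb, exp, factorial


def thinned_degree_distribution(p, q):
    """Apply Eq. (77): p'[j] = sum_{k>=j} p[k] * C(k, j) * q**j * (1-q)**(k-j).

    p is a list with p[k] the probability of degree k (truncated at len(p)-1).
    """
    kmax = len(p) - 1
    return [
        sum(p[k] * comb(k, j) * q**j * (1 - q) ** (k - j) for k in range(j, kmax + 1))
        for j in range(kmax + 1)
    ]


# Illustrative case: Poisson degree distribution with mean z, truncated at k = 59.
z, q = 3.0, 0.4
p = [exp(-z) * z**k / factorial(k) for k in range(60)]
p_thin = thinned_degree_distribution(p, q)
mean_thin = sum(k * pk for k, pk in enumerate(p_thin))
```

For a Poisson distribution the thinned distribution is again Poisson, with mean qz, which gives a convenient sanity check on the sum.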

Since vertex failure is random and uncorrelated, the subset
of all vertices that are occupied forms another
configuration model with this degree distribution. Cohen
et al. then applied the criterion of Molloy and Reed,
Eq. (28), to determine whether this network has a giant
component. (One could also apply Eqs. (29) and (30)
to determine the size of the giant and non-giant components,
although this is not done in Ref. [93].)

One of the most interesting conclusions of the work of
Cohen et al. is for the case of networks with power-law degree
distributions $p_k \sim k^{-\alpha}$ for some constant $\alpha$. When
$\alpha < 3$, they find that the critical value $q_c$ of $q$ at which
a giant component forms
is zero or negative, indicating that the network always
has a giant component, or in the language of physics,
the network always percolates. This echoes the numerical
results of Albert et al. [15], who found that the connectivity
of power-law networks was highly robust to the
random removal of vertices. In general, the method of
Cohen et al. indicates that $q_c \le 0$ for any degree distribution
with a diverging second moment.

An alternative and more general approach to the percolation
problem on the configuration model has been
put forward by Callaway et al. [81], using a generalization
of the generating function formalism discussed in
Sec. IV.B.1. In their method, the probability of occupation
of a vertex can be any function of the degree $k$
of that vertex. Thus the constant $q$ of the approach of
Cohen et al. is generalized to $q_k$, the probability that a
vertex having degree $k$ is occupied. One defines generating
functions

$$F_0(x) = \sum_{k=0}^{\infty} p_k q_k x^k, \qquad F_1(x) = \frac{\sum_k k p_k q_k x^{k-1}}{\sum_k k p_k}, \tag{78}$$



and it can then be shown that the probability distribution
of the size of the component of occupied vertices to which
a randomly chosen vertex belongs is generated by $H_0(x)$,
where

$$H_0(x) = 1 - F_0(1) + x F_0(H_1(x)), \tag{79a}$$
$$H_1(x) = 1 - F_1(1) + x F_1(H_1(x)). \tag{79b}$$

(Note that $F_0$ is not a properly normalized generating
function in the sense that $F_0(1) \ne 1$.) From this one can
derive an expression for the mean component size:

$$\langle s \rangle = H_0'(1) = F_0(1) + \frac{F_0'(1) F_1(1)}{1 - F_1'(1)}, \tag{80}$$

which immediately tells us that the phase transition at
which a giant component forms takes place at $F_1'(1) = 1$.

The size of the giant component is given by

$$S = F_0(1) - F_0(u), \qquad u = 1 - F_1(1) + F_1(u). \tag{81}$$

For instance, in the case studied by Cohen et al. [93]
of uniform occupation probability $q_k = q$, this gives a
critical occupation probability of $q_c = 1/G_1'(1)$, where
$G_1(x)$ is the generating function for the degree distribution
itself, as defined in Eq. (23). Taking the example of
a power-law degree distribution $p_k = k^{-\alpha}/\zeta(\alpha)$, Eq. (22),
we find

$$q_c = \frac{\zeta(\alpha-1)}{\zeta(\alpha-2) - \zeta(\alpha-1)}. \tag{82}$$

This is negative (and hence unphysical) for $\alpha < 3$, confirming
the finding that the system always percolates in
this regime. Note that $q_c > 1$ for sufficiently large $\alpha$,
which is also unphysical. One finds that the system
never percolates for $\alpha > \alpha_c$, where $\alpha_c$ is the solution
of $\zeta(\alpha-2) = 2\zeta(\alpha-1)$, which gives $\alpha_c = 3.4788\ldots$ This
corresponds to the point at which the underlying network
itself ceases to have a giant component, as shown
by Aiello et al. and discussed in Sec. IV.B.1.
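
The critical exponent quoted above can be checked numerically. The sketch below evaluates the Riemann zeta function by direct summation with an Euler-Maclaurin tail correction and bisects on the condition $\zeta(\alpha-2) = 2\zeta(\alpha-1)$. The helper names, the bracketing interval, and the number of summed terms are assumptions made for illustration; the direct sum is valid only where the zeta arguments exceed 1, so `q_c` here applies for $\alpha > 3$.

```python
def zeta(s, n=1000):
    """Riemann zeta for real s > 1: direct sum of n-1 terms plus Euler-Maclaurin tail."""
    head = sum(k ** -s for k in range(1, n))
    tail = n ** (1.0 - s) / (s - 1.0) + 0.5 * n ** -s + s * n ** (-s - 1.0) / 12.0
    return head + tail


def q_c(alpha):
    """Critical occupation probability of Eq. (82), evaluated for alpha > 3."""
    return zeta(alpha - 1) / (zeta(alpha - 2) - zeta(alpha - 1))


def alpha_c(lo=3.1, hi=4.0, tol=1e-10):
    """Bisect on f(alpha) = zeta(alpha-2) - 2*zeta(alpha-1), which decreases with alpha."""
    def f(a):
        return zeta(a - 2) - 2.0 * zeta(a - 1)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

At the root the two expressions coincide, so `q_c(alpha_c())` should equal 1 to within numerical error, which is the point at which random removal of any fraction of vertices destroys the giant component.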

The main advantage of the approach of Callaway et al.
is that it allows us to remove vertices from the network
in an order that depends on their degree. If, for instance,
we set $q_k = \theta(k_{\max} - k)$, where $\theta(x)$ is the Heaviside step
function, then we remove all vertices with degrees greater
than $k_{\max}$. This corresponds precisely to the experiment
of Broder et al. [74], who looked at the behavior of the
World Wide Web graph as vertices were removed in order
of decreasing degree. (Similar but not identical calculations
were also performed by Albert et al. [15].) In agreement
with the numerical calculations (see Sec. III.D),
Callaway et al. find that networks with power-law degree
distributions are highly susceptible to this type of
targeted attack; one need only remove a small percentage
of vertices to destroy the giant component entirely.
Similar results were also found independently by Cohen
et al. [94], using a closely similar method, and in
a later paper [362] some of the same authors extended
their calculations to directed networks also, which show
a considerably richer component structure, as described
in Sec. IV.B.3.

FIG. 14 The fraction of vertices that must be removed from
a network to destroy the giant component, if the network has
the form of a configuration model with a power-law degree
distribution of exponent $\alpha$, and vertices are removed in decreasing
order of their degrees.

As an example, consider Fig. 14, which shows the fraction
of the highest degree vertices that must be removed
from a network with a power-law degree distribution to
destroy the giant component, as a function of the exponent
$\alpha$ of the power law in! lam. As the figure shows,
the maximum fraction is less than three percent, and for
most values of $\alpha$ the fraction is significantly less than this.
This appears to imply that networks like the Internet
and the Web that have power-law degree distributions
are highly susceptible to such attacks [1^, U3, LtM].
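
A rough version of this calculation can be sketched with the degree-dependent occupation probability $q_k$ set to 1 for $k \le k_{\max}$ and 0 otherwise, so that the giant component exists only while $F_1'(1) = \sum_{k \le k_{\max}} k(k-1)p_k / \sum_k k p_k$ is at least 1. The code below assumes a pure power law $p_k = k^{-\alpha}/\zeta(\alpha)$ with $\alpha > 2$ and an integer cutoff, so its output is a step function of $\alpha$; the function names and the tail-corrected sums are illustrative choices, not the procedure used to draw the figure.

```python
def power_sum(s, start, extra=2000):
    """Sum over k >= start of k**(-s) for s > 1 (direct terms plus Euler-Maclaurin tail)."""
    n = start + extra
    head = sum(k ** -s for k in range(start, n))
    tail = n ** (1.0 - s) / (s - 1.0) + 0.5 * n ** -s + s * n ** (-s - 1.0) / 12.0
    return head + tail


def critical_removed_fraction(alpha, k_cap=10**6):
    """Fraction of highest-degree vertices whose removal destroys the giant component.

    Returns None if F_1'(1) never reaches 1 below k_cap (no giant component found).
    """
    norm = power_sum(alpha, 1)            # zeta(alpha)
    mean_sum = power_sum(alpha - 1.0, 1)  # sum_k k * k**(-alpha), needs alpha > 2
    cum = 0.0
    for k in range(1, k_cap + 1):
        cum += k ** (2.0 - alpha) - k ** (1.0 - alpha)  # k(k-1) * k**(-alpha)
        if cum >= mean_sum:
            # Smallest cutoff k_max = k with a giant component; removing all
            # vertices of degree >= k therefore destroys it.
            return power_sum(alpha, k) / norm
    return None


fraction_2_5 = critical_removed_fraction(2.5)
```

For $\alpha = 2.5$ this sketch gives a removed fraction of about two percent, in line with the bound of three percent stated above.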

These results are for the configuration model. Other
models offer some further insights. The finding by Cohen
et al. [93] that the threshold value $q_c$ at which percolation
sets in for the configuration model is zero for
degree distributions with a divergent second moment has
attracted particular interest. Vazquez and Moreno |40J|,
for example, have shown that the threshold may be zero
even for finite second moment if the degrees of adjacent
vertices in the network are positively correlated
(see Secs. III.F and IV.B.5). Conversely, if the second
moment does diverge there may still be a non-zero
threshold if there are negative degree correlations. Warren
et al. [408] have shown that there can also be a non-zero
threshold for a network incorporating geographical
effects, in which each vertex occupies a position in a low-dimensional
space (typically two-dimensional) and the probability
of connection is higher for vertex pairs that are
close together in that space. A similar spatial model has
been studied by Rozenfeld et al. [359], and both models
are closely related to continuum percolation [278].

A topic related to resilience to vertex deletion is that of
cascading failures. In some networks, such as
electrical power networks, that carry load or distribute 
a resource, the operation of the network is such that the 
failure of one vertex or edge results in the redistribution 
of the load on that vertex or edge to other nearby ver- 
tices or edges. If vertices or edges fail when the load on 
them exceeds some maximum capacity, then this mech- 
anism can result in a cascading failure or avalanche in 
which the redistribution of load pushes a vertex or edge 
over its threshold and causes it to fail, leading to fur- 
ther redistribution. Such a cascading failure in the west- 
ern United States in August 1996 resulted in the spread 
of what was initially a small power outage in El Paso, 
Texas through six states as far as Oregon and California,
leaving several million electricity customers without
power. Watts [413] has given a simple model of this process
that can be mapped onto a type of percolation model
and hence can be solved using generating function meth- 
ods similar to those for simple vertex removal processes 
above. 

In Watts's model, a vertex $i$ fails if a given fraction
$\phi_i$ of its neighbors have failed, where the quantities $\{\phi_i\}$
are iid variables drawn from a distribution $f(\phi)$. The
model is seeded by the initial failure of some non-zero
density $\Phi_0$ of vertices, chosen uniformly at random. It is
assumed that $\Phi_0 \ll 1$, so that the initial seed consists,
to leading order, of single isolated vertices. Watts considers
networks with the topology of the configuration
model (Sec. IV.B.1), for which, because the vanishing
density of short loops makes the networks tree-like at
small length-scales, each vertex will have at most
a single failed neighboring vertex in the initial stages of
the cascade, and hence will fail itself if and only if its
threshold for failure satisfies $\phi < 1/k$, where $k$ is its degree.
Watts calls vertices satisfying this criterion vulnerable.
The probability of a vertex being vulnerable is
$q_k = \int_0^{1/k} f(\phi)\, d\phi$, and the cascade will spread only if
such vertices connect to form a percolating (i.e., extensive)
cluster on the network. Thus the problem maps
directly onto the generalized percolation process studied
by Callaway et al. [81] above, allowing us to find a condition
for the spread of the initial seed to give a large-scale
cascade. The percolation model applies only to the vulnerable
vertices, however, so to calculate the final sizes of
cascades Watts performs numerical simulations.
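
The cascade condition can be illustrated with a short sketch. It assumes, purely for illustration, a Poisson degree distribution with mean z and a single threshold value `phi_star` shared by all vertices, so that a vertex of degree k is vulnerable exactly when `phi_star < 1/k`. The condition tested is that the vulnerable vertices percolate, i.e. that $\sum_k k(k-1) q_k p_k / z$ exceeds 1; the function name and parameters are hypothetical.

```python
from math import exp, factorial


def cascade_possible(z, phi_star, kmax=100):
    """True if vulnerable vertices form an extensive cluster on a Poisson random graph.

    Vulnerability indicator: q_k = 1 when phi_star < 1/k, else 0.
    Percolation condition: sum_k k(k-1) q_k p_k / z > 1.
    """
    total = 0.0
    for k in range(2, kmax + 1):
        if phi_star * k < 1.0:
            p_k = exp(-z) * z**k / factorial(k)
            total += k * (k - 1) * p_k
    return total / z > 1.0
```

With mean degree 4, a low threshold such as 0.18 leaves vertices of degree up to 5 vulnerable and the condition is met, whereas a threshold of 0.4 leaves only vertices of degree 2 or less vulnerable and it is not.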

Models of cascading failure have also been studied by
Holme and Kim |l95l flOgl, by Moreno et al. [298],
and by Motter and Lai [305]. In the model of Holme
and Kim, for instance, load on a vertex is quantified by
the betweenness centrality of the vertex (see Sec. III.I),
and vertices fail when the betweenness exceeds a given
threshold. Holme and Kim give simulation results for the
avalanche size distribution in their model.



B. Epidemiological processes 

One of the original, and still primary, reasons for
studying networks is to understand the mechanisms by
which diseases and other things (information, computer
viruses, rumors) spread over them. For instance, the
main reason for the study of networks of sexual contact
|4^ llMllMlinil24al2M l266 ll8Ml8R8| (Sec.[OJ
is to help us understand and perhaps control the spread
of sexually transmitted diseases. Similarly one studies
networks of email contact [136, 321] to learn how computer
viruses spread [footnote 34].



1. The SIR model 

The simplest model of the spread of a disease over a
network is the SIR model of epidemic disease [22, 26,
192] [footnote 35]. This model, first formulated, though never published,
by Lowell Reed and Wade Hampton Frost in the
1920s, divides the population into three classes: susceptible
(S), meaning they don't have the disease of interest
but can catch it if exposed to someone who does; infective
[footnote 36] (I), meaning they have the disease and can pass
it on; and recovered (R), meaning they have recovered
from the disease and have permanent immunity, so that
they can never get it again or pass it on. (Some authors
consider the R to stand for "removed," a general term
that encompasses also the possibility that people may die
of the disease and remove themselves from the infective
pool in that fashion. Others consider the R to mean "refractory,"
which is the common term among those who
study the closely related area of reaction diffusion processes
[3861142^].)

In traditional mathematical epidemiology [23, 26, 192],
one then assumes that any susceptible individual has a
uniform probability $\beta$ per unit time of catching the disease
from any infective one and that infective individuals
recover and become immune at some stochastically constant
rate $\gamma$. The fractions $s$, $i$ and $r$ of individuals in
the states S, I and R are then governed by the differential
equations

$$\frac{ds}{dt} = -\beta i s, \qquad \frac{di}{dt} = \beta i s - \gamma i, \qquad \frac{dr}{dt} = \gamma i. \tag{83}$$



Footnote 34. Computer viruses are an interesting case in that the networks
over which they spread are normally directed, unlike the contact
networks for most human diseases L'iii.

Footnote 35. One distinguishes between an epidemic disease such as influenza,
which sweeps through the population rapidly and infects a significant
fraction of individuals in a short outbreak, and an endemic
disease such as measles, which persists within the population at
a level roughly constant over time. The SIR model is a model of
the former. The SIS model discussed in Sec. VIII.B.2 is a model
of the latter.

Footnote 36. In everyday parlance the more common word is "infectious," but
infective is the standard term among epidemiologists.

Models of this type are called fully mixed, and although
they have taught us much about the basic dynamics of 
diseases, they are obviously unrealistic in their assump- 
tions. In reality diseases can only spread between those 
individuals who have actual physical contact of one sort 
or another, and the structure of the contact network is 
important to the pattern of development of the disease. 
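
The fully mixed equations, Eq. (83), are easy to integrate numerically. The sketch below uses a fixed-step fourth-order Runge-Kutta scheme; the step size, the initial infective fraction, and the parameter values in the example are arbitrary illustrative choices.

```python
import math


def sir_rhs(state, beta, gamma):
    """Right-hand side of Eq. (83) for state = (s, i, r)."""
    s, i, r = state
    return (-beta * i * s, beta * i * s - gamma * i, gamma * i)


def integrate_sir(beta, gamma, i0=1e-3, dt=0.01, t_max=200.0):
    """Integrate the fully mixed SIR model with RK4 and return the final (s, i, r)."""
    state = (1.0 - i0, i0, 0.0)
    for _ in range(int(t_max / dt)):
        k1 = sir_rhs(state, beta, gamma)
        k2 = sir_rhs(tuple(x + 0.5 * dt * k for x, k in zip(state, k1)), beta, gamma)
        k3 = sir_rhs(tuple(x + 0.5 * dt * k for x, k in zip(state, k2)), beta, gamma)
        k4 = sir_rhs(tuple(x + dt * k for x, k in zip(state, k3)), beta, gamma)
        state = tuple(
            x + dt * (a + 2 * b + 2 * c + d) / 6.0
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state


s_inf, i_inf, r_inf = integrate_sir(beta=0.5, gamma=0.2)
# Dividing ds/dt by dr/dt in Eq. (83) gives s = s(0) * exp(-(beta/gamma) * r).
final_size_residual = s_inf - (1.0 - 1e-3) * math.exp(-(0.5 / 0.2) * r_inf)
```

The residual in the last line checks the conserved relation between s and r that follows directly from Eq. (83), and should be close to zero for a sufficiently small step.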

The SIR model can be generalized in a straightforward
manner to an epidemic taking place on a network,
although the resulting dynamical system is substantially
more complicated than its fully mixed counterpart. The
important observation that allows us to make progress,
first made by Grassberger [179], is that the model can
be mapped exactly onto bond percolation on the same
network. Indeed, as pointed out by Sander et al. |36(1 ],
significantly more general models can also be mapped to
percolation, in which transmission probability between
pairs of individuals and the times for which individuals
remain infective both vary, but are chosen in iid fashion
from some appropriate distributions. Let us suppose that
the infection rate $\beta$, defined as the probability
per unit time that an infective individual will pass
the disease on to a particular susceptible network neighbor,
is drawn from a distribution $P_i(\beta)$. And suppose
that the recovery rate $\gamma$ is drawn from another distribution
$P_r(\gamma)$. Then the resulting model can be shown [315]
to be equivalent to uniform bond percolation on the same
network with edge occupation probability

$$T = 1 - \int_0^\infty \!\! \int_0^\infty P_i(\beta) P_r(\gamma)\, e^{-\beta/\gamma}\, d\beta\, d\gamma. \tag{84}$$
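
Equation (84) is an average of $1 - e^{-\beta/\gamma}$ over independent draws of the two rates, so it can be estimated by simple Monte Carlo sampling. In the sketch below the function name and the uniform distributions used for the rates are assumptions chosen only to have something concrete to sample.

```python
import math
import random


def transmissibility(betas, gammas):
    """Estimate T = 1 - E[exp(-beta/gamma)] from paired independent samples."""
    n = len(betas)
    return 1.0 - sum(math.exp(-b / g) for b, g in zip(betas, gammas)) / n


# Constant rates reduce Eq. (84) to T = 1 - exp(-beta/gamma).
T_const = transmissibility([0.5], [1.0])

# Illustrative random rates (uniform distributions are an arbitrary choice).
rng = random.Random(0)
betas = [rng.uniform(0.1, 0.9) for _ in range(10000)]
gammas = [rng.uniform(0.5, 1.5) for _ in range(10000)]
T_random = transmissibility(betas, gammas)
```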



The extraction of predictions about epidemics from 
the percolation model is simple: the distribution of per- 
colation clusters (i.e., components connected by occu- 
pied edges) corresponds to the distribution of the sizes 
of disease outbreaks that start with a randomly chosen 
initial carrier, the percolation transition corresponds to 
the "epidemic threshold" of epidemiology, above which 
an epidemic outbreak is possible (i.e., one that infects a 
non-zero fraction of the population in the limit of large 
system size), and the size of the giant component above
this transition corresponds to the size of the epidemic. 
What the mapping cannot tell us, but standard epidemi- 
ological models can, is the time progression of a disease 
outbreak. The mapping gives us results only for the ul- 
timate outcome of the disease in the limit of long times, 
in which all individuals are in either the S or R states, 
and no new cases of the disease are occurring. Nonetheless,
there is much to be learned by studying even the
non-time-varying properties of the model.

The solution of bond percolation for the configuration
model was given by Callaway et al. [81], who showed
that, for uniform edge occupation probability $T$, the distribution
of the sizes of clusters (i.e., disease outbreaks
in epidemiological language) is generated by the function
$H_0(x)$ where

$$H_0(x) = x G_0(H_1(x)), \tag{85a}$$
$$H_1(x) = 1 - T + T x G_1(H_1(x)), \tag{85b}$$

where $G_0(x)$ and $G_1(x)$ are defined in Eqs. (23). This
gives an epidemic transition that takes place at $T_c = 1/G_1'(1)$,
a mean outbreak size $\langle s \rangle$ given by

$$\langle s \rangle = H_0'(1) = 1 + \frac{T G_0'(1)}{1 - T G_1'(1)}, \tag{86}$$

and an epidemic outbreak that affects a fraction $S$ of the
network, where

$$S = 1 - G_0(u), \qquad u = 1 - T + T G_1(u). \tag{87}$$
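
These expressions are straightforward to evaluate for a given degree distribution. The sketch below takes a truncated list `p` of degree probabilities, computes the threshold $T_c$, and returns either the mean outbreak size of Eq. (86) below the transition or the epidemic size of Eq. (87) above it, the latter by fixed-point iteration starting from u = 0. The function name, the return convention, and the Poisson example are illustrative assumptions.

```python
from math import exp, factorial


def epidemic_on_configuration_model(p, T, tol=1e-14, max_iter=100000):
    """Return (T_c, mean outbreak size or None, epidemic size S) for p[k] and transmissibility T."""
    z = sum(k * pk for k, pk in enumerate(p))                     # G_0'(1)
    g1_prime = sum(k * (k - 1) * pk for k, pk in enumerate(p)) / z  # G_1'(1)
    T_c = 1.0 / g1_prime

    def G0(x):
        return sum(pk * x**k for k, pk in enumerate(p))

    def G1(x):
        return sum(k * pk * x ** (k - 1) for k, pk in enumerate(p) if k > 0) / z

    if T < T_c:
        return T_c, 1.0 + T * z / (1.0 - T * g1_prime), 0.0

    # Iterating from u = 0 converges monotonically to the smallest fixed point.
    u = 0.0
    for _ in range(max_iter):
        u_next = 1.0 - T + T * G1(u)
        converged = abs(u_next - u) < tol
        u = u_next
        if converged:
            break
    return T_c, None, 1.0 - G0(u)


# Illustrative Poisson degree distribution with mean 3, truncated at k = 79.
p_poisson = [exp(-3.0) * 3.0**k / factorial(k) for k in range(80)]
T_c, _, S = epidemic_on_configuration_model(p_poisson, 0.5)
_, mean_size, _ = epidemic_on_configuration_model(p_poisson, 0.2)
```

For a Poisson distribution with mean z the two generating functions coincide and Eq. (87) reduces to $S = 1 - e^{-zTS}$, which provides an independent check on the iteration.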



Similar solutions can be found for a wide variety of other
model networks, including networks with correlations of
various kinds between the rates of infection or the infectivity
times [315], networks with correlations between the
degrees of vertices [301], and networks with more complex
structure, such as different types of vertices [2lll315|.

One of the most important conclusions of this work
is for the case of networks with power-law degree distributions,
for which, as in the case of site percolation
(Sec. VIII.A), there is no non-zero epidemic threshold
so long as the exponent of the power law is less than 3.
Since most power-law networks satisfy this condition, we
expect diseases always to propagate in these networks,
regardless of transmission probability between individuals.
This point was first made, in the context of models
of computer virus epidemiology, by Pastor-Satorras and
Vespignani [333, 336], although, as pointed out by Lloyd
and May [267, 277], precursors of the same result can be
seen in earlier work of May and Anderson [276]. May
and Anderson studied traditional (fully mixed) differential
equation models of epidemics, without network structure,
but they divided the population into activity classes
with different values of the infection rate $\beta$. They showed
that the variation of the number of infective individuals
over time depends on the variance of this rate over the
classes, and in particular that the disease always multiplies
exponentially if the variance diverges, which is precisely the
situation in a network with a power-law degree distribution
and exponent less than 3.

The conclusion that diseases always spread on scale-free
networks has been revised somewhat in the light of
later discoveries. In particular, there may be a non-zero
percolation threshold for certain types of correlations between
vertices |H H3 |H IH Hoil EiS, if the network
is embedded in a low-dimensional (rather than infinite-dimensional)
space [359, 408], or if the network has high
transitivity [139] (see Sec. III.B).

An interesting combination of the ideas of epidemiol- 
ogy with those of network resilience explored in the pre- 
ceding section arises when one considers vaccination of 
a population against the spread of a disease. Vaccina- 
tion can be regarded as the removal from a network of 
some particular set of vertices, and this in turn can be 
modeled as a site percolation process. Thus one is led to 
consideration of joint site/bond percolation on networks, 
which has also been solved, in the simplest uniformly 
random case, by Callaway et al. [81]. If the site percolation
is correlated with vertex degree (as in Eq. (78)
and following), for example removing the vertices with
highest degree, then one has a model for targeted vaccination
strategies also. A good discussion has been given
by Pastor-Satorras and Vespignani [335]. As with the
models of Sec. VIII.A, one finds that networks tend to be
particularly vulnerable to removal of their highest degree 
vertices, so this kind of targeted vaccination is expected 
to be particularly effective. (This of course is not news 
to the public health community, who have long followed 
a policy of focusing their most aggressive disease pre- 
vention efforts on the "core communities" of high-degree 
vertices in a network.) 

Unfortunately, it is not always easy to find the highest 
degree vertices in a social network. The number of sex- 
ual contacts a person has had can normally only be found 
by asking them, and perhaps not even then. An interesting
method that circumvents this problem has been
suggested by Cohen et al. [92]. They observe that since
the probability of reaching a particular vertex by following
a randomly chosen edge in a graph is proportional to
the vertex's degree (Sec. IV.B), one is more likely to find
high-degree vertices by following edges than by choosing 
vertices at random. They propose thus that a population 
can be immunized by choosing a random person from 
that population and vaccinating a friend of that person, 
and then repeating the process. They show both by an- 
alytic calculations and by computer simulation that this 
strategy is substantially more effective than random vaccination.
In a sense, in fact, this strategy is already in
use. The "contact tracing" methods [251] used to control
sexually transmitted diseases, and the "ring vaccination"
method [181, 308] used to control smallpox and foot-and-mouth
disease are both examples of roughly this type of
acquaintance vaccination.
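
The degree bias that acquaintance vaccination exploits can be seen on a very small example. The sketch below computes, exactly rather than by sampling, the expected degree of a uniformly chosen vertex and the expected degree of a uniformly chosen friend of a uniformly chosen vertex. The adjacency-list representation, the function names, and the star-graph example are illustrative assumptions.

```python
def mean_degree(adj):
    """Expected degree of a vertex chosen uniformly at random."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def mean_friend_degree(adj):
    """Expected degree of a random neighbor of a random vertex.

    Vertices with no neighbors are skipped, since they cannot name a friend.
    """
    total, count = 0.0, 0
    for nbrs in adj.values():
        if nbrs:
            total += sum(len(adj[u]) for u in nbrs) / len(nbrs)
            count += 1
    return total / count


# Star graph: vertex 0 is connected to vertices 1..4.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
```

On the star graph a random vertex has mean degree 1.6, while a random friend has mean degree 3.4, because four of the five vertices name the hub.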



2. The SIS model 



Not all diseases confer immunity on their survivors.
Diseases that, for instance, are not self-limiting but can
be cured by medicine, can usually be caught again immediately
by an unlucky patient. Tuberculosis and gonorrhea
are two much-studied examples. Computer viruses
also fall into this category; they can be "cured" by anti-virus
software, but without a permanent virus-checking
program the computer has no way to fend off subsequent
attacks by the same virus.

With diseases of this kind carriers that are cured move
from the infective pool not to a recovered pool, but back
into the susceptible one. A model with this type of dynamics
is called an SIS model, for obvious reasons. In
the simplest, fully mixed, single-population case, its dynamics
are described by the differential equations

$$\frac{ds}{dt} = -\beta i s + \gamma i, \qquad \frac{di}{dt} = \beta i s - \gamma i, \tag{88}$$

where $\beta$ and $\gamma$ are, as before, the infection and recovery
rates.

The SIS model is a model of endemic disease. Since
carriers can be infected many times, it is possible, and
does happen in some parameter regimes, that the disease
will persist indefinitely, circulating around the population
and never dying out. The equivalent of the SIR epidemic
transition is the phase boundary between the parameter
regimes in which the disease persists and those in which
it does not.

The SIS model cannot be solved exactly on a network
as the SIR model can, but a detailed mean-field
treatment has been given by Pastor-Satorras and Vespignani
[332, 333] for SIS epidemics on the configuration
model. Their approach is based on the differential equations,
Eq. (88), but they allow the rate of infection $\beta$
to vary between members of the population, rather than
holding it constant. (This is similar to the approach of
May and Anderson [276] for the SIR model, discussed
in Sec. VIII.B.1, but is more general, since it does not
involve the division of the population into a binned set
of activity classes, as the May-Anderson approach does.)
The calculation proceeds as follows.

The quantity $\beta i$ appearing in (88) represents the average
rate at which susceptible individuals become infected
by their neighbors. For a vertex of degree $k$,
Pastor-Satorras and Vespignani make the replacement
$\beta i \to k\lambda\Theta(\lambda)$, where $\lambda$ is the rate of infection via contact
with a single infective individual and $\Theta(\lambda)$ is the
probability that the neighbor at the other end of an edge
will in fact be infective. Note that $\Theta$ is a function of $\lambda$
since presumably the probability of being infective will
increase as the probability of passing on the disease increases.
The remaining occurrences of the variables $s$ and
$i$ Pastor-Satorras and Vespignani replace by $s_k$ and $i_k$,
which are degree-dependent generalizations representing
the fraction of vertices of degree $k$ that are susceptible or
infective. Then, noticing that $i_k$ and $s_k$ obey $i_k + s_k = 1$,
we can rewrite (88) as the single differential equation

$$\frac{di_k}{dt} = k\lambda\Theta(\lambda)(1 - i_k) - i_k, \tag{89}$$



where we have, without loss of generality, set the recovery
rate $\gamma$ equal to 1. There is an approximation inherent
in this formulation, since we have assumed that $\Theta(\lambda)$
is the same for all vertices, when in general it too will
be dependent on vertex degree. This is in the nature
of a mean-field approximation, and can be expected to
give a reasonable guide to the qualitative behavior of the
system, although certain properties (particularly close to
the phase transition) may be quantitatively mispredicted.
Looking for stationary solutions, we find

$$i_k = \frac{k\lambda\Theta(\lambda)}{1 + k\lambda\Theta(\lambda)}. \tag{90}$$

To calculate the value of $\Theta(\lambda)$, one averages the probability
$i_k$ of being infected over all vertices. Since $\Theta(\lambda)$
is defined as the probability that the vertex at the end
of an edge is infective, $i_k$ should be averaged over the
distribution $k p_k/z$ of the degrees of such vertices (see
Sec. IV.B.1), where $z = \sum_k k p_k$ is, as usual, the mean
degree. Thus

$$\Theta(\lambda) = \frac{1}{z} \sum_k k p_k i_k. \tag{91}$$

Eliminating $i_k$ from Eqs. (90) and (91) we then obtain
an implicit expression for $\Theta(\lambda)$:

$$\frac{\lambda}{z} \sum_k \frac{k^2 p_k}{1 + k\lambda\Theta(\lambda)} = 1. \tag{92}$$
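
For a degree distribution with finite support, Eq. (92) can be solved numerically. Setting $\Theta \to 0$ in Eq. (92) gives the mean-field threshold $\lambda_c = z/\langle k^2 \rangle$; above it the left-hand side decreases monotonically in $\Theta$, so bisection on the interval (0, 1) finds the non-trivial solution. The sketch below does this for a truncated list `p`; the function name and return convention are illustrative.

```python
def sis_mean_field(p, lam, tol=1e-13):
    """Solve Eq. (92) for Theta and return (lambda_c, Theta, [i_k]) for degree probabilities p[k]."""
    z = sum(k * pk for k, pk in enumerate(p))
    k2 = sum(k * k * pk for k, pk in enumerate(p))
    lam_c = z / k2  # threshold obtained from Eq. (92) at Theta = 0

    if lam <= lam_c:
        theta = 0.0  # only the disease-free solution exists
    else:
        def g(theta):
            s = sum(k * k * pk / (1.0 + k * lam * theta) for k, pk in enumerate(p))
            return (lam / z) * s - 1.0

        lo, hi = 0.0, 1.0  # g(lo) > 0 and g(hi) < 0 above threshold
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if g(mid) > 0:
                lo = mid
            else:
                hi = mid
        theta = 0.5 * (lo + hi)

    # Stationary infective fraction for each degree, Eq. (90).
    i_k = [k * lam * theta / (1.0 + k * lam * theta) for k in range(len(p))]
    return lam_c, theta, i_k


# Illustrative check: every vertex has degree 4.
regular = [0.0, 0.0, 0.0, 0.0, 1.0]
lam_c, theta, i_k = sis_mean_field(regular, 0.5)
```

For a regular graph of degree 4 the equation can be solved by hand, giving $\Theta = 1 - 1/(4\lambda)$, which the numerical solution should reproduce.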



For particular choices of $p_k$ this equation can be solved
for $\Theta(\lambda)$ either exactly or approximately. For instance,
for a power-law degree distribution of the form (22),
Pastor-Satorras and Vespignani solve it by making an integral
approximation, and hence show that there is no
non-zero epidemic threshold for the SIS model in the
power-law case: the disease will always persist, regardless
of the value of the infection rate parameter $\lambda$ [333].
They have also generalized the solution to a number of
other cases, including other degree distributions [332],
finite-sized networks [334], and models that include vaccination
of some fraction of individuals [335, 336]. In
the latter case, they tackle both random vaccination and
vaccination targeted at the vertices with highest degree
using a method similar to that of Cohen et al. [93], in
which they calculate the effective degree distribution of
the network after the removal of a given set of vertices
and then apply their mean-field method to the resulting
network. As we would expect from the results of Cohen
et al., propagation of the disease turns out to be relatively
robust against random vaccination, at least in networks
with right-skewed degree distributions, but highly
susceptible to vaccination of the highest-degree individuals.
The mean-field method has also been applied to
networks with degree correlations of the type discussed
in Sec. III.F by Boguñá et al. jEF. Of particular note is
their finding that for the case of power-law degree distributions
neither assortative nor disassortative mixing by
degree can produce a non-zero epidemic threshold in the
SIS model, at least within the mean-field approximation.
This contrasts with the case for the SIR model, where
it was found that disassortative mixing can produce a
non-zero threshold l40Cl.



The mean-field method can also be applied to the SIR
model [24, 299]. Although we have an exact solution for
the SIR model, as described in Sec. VIII.B.1, that solution
can only tell us about the long-time behavior of an
outbreak (its expected final size and so forth). The mean-field
method, although approximate, can tell us about
the time evolution of an outbreak, so the two methods
are complementary. The mean-field method for the SIR
model can also be used to treat approximately the effects
of network transitivity pl ll5ll22H235|.



C. Search on networks 

Another example of a process taking place on a net- 
work that has important practical applications is network 
search. Suppose some resource of interest is stored at the 
vertices of a network, such as information on Web pages, 
or computer files on a distributed database or file-sharing 
network. One would like to determine rapidly where on 
the network a particular item of interest can be found 
(or determine that it is not on the network at all). One 
way of doing this, which is used by Web search engines, 
is simply to catalog exhaustively (or "crawl") the entire
network, creating a distilled local map of the data
found. Such a strategy is favored in cases where there 
is a heavy communication cost to searching the network 
in real time, so that it makes sense to create a local in- 
dex. While performing a network crawl is, in principle, 
straightforward (although in practice it may be technically
very challenging [72]), there are nonetheless some
interesting theoretical questions arising. 



1. Exhaustive network search 

One of the triumphs of recent work on networks has 
been the development of effective algorithms for mining 
network crawl data for information of interest, particu- 
larly in the context of the World Wide Web. The im- 
portant trick here turns out to be to use the information 
contained in the edges of the network as well as in the 
vertices. Since the edges, or hyperlinks, in the World 
Wide Web are created by people in order to highlight 
connections between the contents of pairs of pages, their 
structure contains information about page content and 
relevance which can help us to improve search perfor- 
mance. The good search engines therefore make a local 
catalog not only of the contents of web pages, but also 
of which ones link to which others. Then when a query 
is made of the database, usually in the form of a tex- 
tual string of interest, the typical strategy would be to 
select a subset of pages from the database by searching 
for that string, and then to rank the results using the 
edge information. The classic algorithm, due to Brin
and Page [72, 328], is essentially identical in its simplest
form to the eigenvector centrality long used in social network
analysis |6o . 1671 13631 l409j |. Each vertex $i$ is assigned
a weight $x_i > 0$, which is defined to be proportional to
the sum of the weights of all vertices that point to $i$:

$$x_i = \lambda^{-1} \sum_j A_{ij} x_j$$

for some $\lambda > 0$, or in matrix form

$$\mathbf{A}\mathbf{x} = \lambda \mathbf{x}, \tag{93}$$



where $\mathbf{A}$ is the (asymmetric) adjacency matrix of the
graph, whose elements are $A_{ij}$, and $\mathbf{x}$ is the vector whose
elements are the $x_i$. This of course means that the
weights we want are an eigenvector of the adjacency matrix
with eigenvalue $\lambda$ and, provided the network is connected
(there are no separate components), the Perron-Frobenius
theorem then tells us that there is only one
eigenvector with all weights non-negative, which is the
unique eigenvector corresponding to the largest eigenvalue.
This eigenvector can be found trivially by repeated
multiplication of the adjacency matrix into any
initial non-zero vector which is not itself an eigenvector.
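
The repeated multiplication just described is the power method, and a minimal sketch of it follows. It uses the convention that `A[i][j]` is 1 if vertex j points to vertex i, normalizes the weights to sum to 1 after each step, and stops when successive vectors agree; the function name, the stopping rule, and the four-vertex example are illustrative assumptions, and none of the additional refinements used by real search engines are included.

```python
def eigenvector_centrality(A, max_iter=1000, tol=1e-12):
    """Leading eigenvector of A by power iteration, normalized so the weights sum to 1."""
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        if total == 0.0:
            raise ValueError("A @ x vanished; the graph has no usable edges")
        y = [v / total for v in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


# Undirected example: triangle 0-1-2 with a pendant vertex 3 attached to 0.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
x = eigenvector_centrality(A)
Ax = [sum(A[i][j] * x[j] for j in range(4)) for i in range(4)]
lam = sum(Ax)  # eigenvalue estimate, since the entries of x sum to 1
```

In the example the best-connected vertex 0 receives the largest weight and the pendant vertex 3 the smallest. Convergence of this simple iteration relies on the largest eigenvalue being strictly dominant, which holds here because the graph contains a triangle.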

This algorithm, which is implemented (along with 
many additional tricks) in the widely used search engine 
Google, appears to be highly effective. In essence the al- 
gorithm makes the assumption that a page is important 
if it is pointed to by other important pages. A more so- 
phisticated ve rsion of th e same idea has been put forward 
by Kleinberg |236l l237t | . who notes that, since the Web 
is a directed network, one can ask not only about which 
vertices point to a vertex of interest, but also about which 
vertices are pointed to by that vertex. This then leads 
to two different weights Xi and yi for each vertex. Klein- 
berg refers to a vertex that is pointed to by highly ranked 
vertices as an authority — it is likely to contain relevant 
information. Such a vertex gets a weight Xj, that is large. 
A vertex that points to highly ranked vertices is referred 
to as a hub; while it may not contain directly relevant 
information, it can tell you where to find such informa- 
tion. It gets a weight yi that is large. (Certainly it is 
possible for a vertex to have both weights large; there is 
no reason why the same page cannot be both a hub and 
an authority.) The appropriate generalization of Eq. I|93|) 
for the two weights is then 

Ay = Ax, A T x = /xy, (94) 

where $A^T$ is the transpose of $A$. Most often we are
interested in the authority weights which, eliminating $y$,
obey $A A^T x = \lambda\mu x$, so that the primary
difference between the method of Brin and Page and the
method of Kleinberg is the replacement of the adjacency
matrix with the symmetric product $A A^T$. More general
forms than Eq. (94) are also possible. One could for example
allow the authority weight of a vertex to depend on the
authority weights of the vertices that point to it (and not
just their hub weights, as in Eq. (94)). This leads to a model
that interpolates smoothly between the Brin-Page and 
Kleinberg methods. As far as we are aware however, this 
has not been tried. Neither has Kleinberg's method been 
implemented yet in a commercial web search engine, to 
the best of our knowledge. 
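
As an illustration of Eq. (94), the two weights can be obtained by alternating the two multiplications and renormalizing, which absorbs the eigenvalue factors. The sketch below is a minimal version under our own assumptions: the name `hub_authority_weights`, the fixed iteration count, the sum-to-one normalization, and the example graph are hypothetical, and A[i][j] = 1 is taken to mean that vertex j points to vertex i.

```python
def hub_authority_weights(A, iters=100):
    """Alternate x <- A y and y <- A^T x, renormalizing each time.

    x holds authority weights and y holds hub weights.
    Assumes the graph has at least one edge, so the sums are non-zero.
    """
    n = len(A)
    x = [1.0] * n
    y = [1.0] * n
    for _ in range(iters):
        # Authority of i: sum of hub weights of vertices pointing to i.
        x = [sum(A[i][j] * y[j] for j in range(n)) for i in range(n)]
        # Hub weight of j: sum of authorities of vertices j points to.
        y = [sum(A[i][j] * x[i] for i in range(n)) for j in range(n)]
        sx, sy = sum(x), sum(y)
        x = [v / sx for v in x]
        y = [v / sy for v in y]
    return x, y


# Hypothetical example: vertex 0 points to 2 and 3, vertex 1 points to 2.
A = [[0, 0, 0, 0],
     [0, 0, 0, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
authority, hub = hub_authority_weights(A)
```

Here vertices 2 and 3 emerge as authorities and vertices 0 and 1 as hubs, with vertex 2 the stronger authority because both hubs point to it.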

The methods described here can also be used for search 
on other directed information networks. Kleinberg's 
method may be particularly suitable for ranking
publications in citation networks, for example. The
Citeseer literature search engine implements a form of
article ranking of this type.

2. Guided network search 

An alternative approach to searching a network is to 
perform a guided search. Guided search strategies may 
be appropriate for certain kinds of Web search,
particularly searches for specialized content that could be
missed by generic search engines (whose coverage tends to be
quite poor), and also for searching on other types of net- 
works such as distributed databases. Exhaustive search 
of the type discussed in the preceding section crawls a 
network once to create an index of the data found, which 
is then stored and searched locally. Guided searches per- 
form small special-purpose crawls for every search query, 
crawling only a small fraction of the network, but doing 
so in an intelligent fashion that deliberately seeks out the 
network vertices most likely to contain relevant informa- 
tion. 

One practical example of a guided search is the
specialized Web crawler or "spider" of Menczer et al.
[280, 281]. This is a program that performs a Web crawl to
find results for a particular query. The method used is a
type of genetic algorithm [285] or enrichment method [180]
that in its simplest form has a number of "agents" that 
start crawling the Web at random, looking for pages that 
contain, for example, particular words or sets of words 
given by the user. Agents are ranked according to their 
success at finding matches to the words of interest and 
those that are least successful are killed off. Those that 
are most successful are duplicated so that the density 
of agents will be high in regions of the Web graph that 
contain many pages that look promising. After some 
specified amount of time has passed, the search is halted 
and a list of the most promising pages found so far is 
presented to the user. The method relies for its success 
on the assumption that pages that contain information 
on a particular topic tend to be clustered together in lo- 
cal regions of the graph. Other than this however, the 
algorithm makes little use of statistical properties of the 
structure of the graph. 
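
The rank, kill, and duplicate loop described above can be caricatured as follows. This is a toy sketch and not the crawler of Menczer et al.: the function `guided_crawl`, the choice of a uniformly random move per round, the rule of replacing the weaker half of the agents by copies of the stronger half, and all parameter values are assumptions made purely for illustration.

```python
import random


def guided_crawl(adj, is_match, n_agents=8, rounds=20, seed=0):
    """Toy agent-based crawl on a graph given as {vertex: neighbors}.

    Agents move at random, score a point for each matching page they
    land on, and after every round the lower-scoring half is replaced
    by copies of the higher-scoring half. n_agents is assumed even.
    """
    rng = random.Random(seed)
    vertices = list(adj)
    agents = [{"at": rng.choice(vertices), "score": 0} for _ in range(n_agents)]
    found = set()
    for _ in range(rounds):
        for agent in agents:
            neighbors = sorted(adj[agent["at"]])
            if neighbors:
                agent["at"] = rng.choice(neighbors)
            if is_match(agent["at"]):
                agent["score"] += 1
                found.add(agent["at"])
        # Rank agents by success; kill the weaker half, duplicate the rest.
        agents.sort(key=lambda a: a["score"], reverse=True)
        survivors = agents[: n_agents // 2]
        agents = survivors + [dict(a) for a in survivors]
    return found


# Hypothetical example: a path of six pages, of which 4 and 5 are relevant.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
pages = guided_crawl(path, lambda v: v >= 4)
```

Because duplicates start where their parent stands, agents accumulate near clusters of matching pages, which is the assumption about topical locality that the method relies on.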

Adamic et al. have given a completely different
algorithm that directly exploits network structure and 
is designed for use on peer-to-peer networks. Their algo- 
rithm makes use of the skewed degree distribution of most 
networks to find the desired results quickly. It works as 
follows. 

Simple breadth-first search can be thought of as a 
query that starts from a single source vertex on a net- 
work. The query goes out to all neighbors of the source 
vertex and says, "Have you got the information I am 
looking for?" Each neighbor either replies "Yes, I have 
it," in which case the search is over, or "No, I don't, but 
I have forwarded your request to all of my neighbors." 
Each of their neighbors, when they receive the request, 
either recognizes it as one they have seen before, in which 
case they discard it, or they repeat the process as above. 
A query of this kind takes aggregate effort O(n) in the
network size. Adamic et al. propose to modify this algo- 
rithm as follows. The initial source vertex again queries 
each of its neighbors for the desired information. But 
now the reply is either "Yes, I have it" or "No, I don't, 
and I have k neighbors," where k is the degree of the ver- 
tex in question. Upon receiving replies of the latter type 
from each of its neighbors, the source vertex finds which 
of its neighbors has the highest value of k and passes
the responsibility for the query like a runner's baton to
that neighbor, who then repeats the entire process with 
their neighbors. (If the highest-degree vertex has already 
handled the query in the past, then the second highest 
is chosen, and so forth; complete recursive back-tracking 
is used to make sure the algorithm never gets stuck in a 
dead end.) 
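
The baton-passing procedure can be sketched as below. This is a minimal illustration under our own assumptions rather than the implementation of Adamic et al.: the function `degree_guided_search`, the adjacency-dictionary representation, the predicate `has_item`, and the decision to count back-tracking moves as baton passes are all hypothetical choices.

```python
def degree_guided_search(adj, source, has_item):
    """Pass a query toward high-degree vertices until the item is found.

    adj maps each vertex to a list of neighbors; has_item(v) says whether
    vertex v holds the information. Returns the number of baton passes,
    or None if the item is not reachable from the source.
    """
    visited = {source}
    stack = [source]  # chain of baton holders, kept for back-tracking
    passes = 0
    while stack:
        holder = stack[-1]
        # The holder checks itself and queries each of its neighbors.
        if has_item(holder) or any(has_item(u) for u in adj[holder]):
            return passes
        fresh = [u for u in adj[holder] if u not in visited]
        if fresh:
            # Hand the baton to the highest-degree neighbor not yet used.
            nxt = max(fresh, key=lambda u: len(adj[u]))
            visited.add(nxt)
            stack.append(nxt)
        else:
            stack.pop()  # dead end: hand the baton back
        passes += 1
    return None


# Hypothetical example: vertex 2 is a hub whose neighbor 6 holds the item.
adj = {0: [1], 1: [0, 2, 3], 2: [1, 4, 5, 6], 3: [1], 4: [2], 5: [2], 6: [2]}
n_passes = degree_guided_search(adj, 0, lambda v: v == 6)
```

In the example the baton goes from vertex 0 to vertex 1 and then to the hub 2, one of whose neighbors holds the item, so two passes suffice.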

The upshot of this strategy is that the baton gets 
passed rapidly up a chain of increasing vertex degree 
until it reaches the highest degree vertices in the net- 
work. On networks with highly skewed degree distribu- 
tions, particularly scale-free (i.e., power-law) networks, 
the neighbors of the high-degree vertices account for a 
significant fraction of all the vertices in the network. On 
average therefore, we need only go a few steps along 
the chain before we find a vertex with a neighbor that 
has the information we are looking for. The maximum
degree on a scale-free network scales with network size
as $n^{1/(\alpha-1)}$ (see Sec. III.C.2), and hence the
number of steps required to search O(n) vertices is of
order $n/n^{1/(\alpha-1)} = n^{(\alpha-2)/(\alpha-1)}$,
which lies between $O(n^{1/2})$ and O(log n) for
$2 < \alpha < 3$, which is the range generally observed in
power-law networks (see Table II). This is a significant
improvement over the O(n) of the simple breadth-first
search, especially for the smaller values of $\alpha$.

This result differs from that given by Adamic et al.,
who adopted the more conservative assumption that the
maximum degree goes as $n^{1/\alpha}$, which gives
significantly poorer search times between $O(n^{2/3})$ and
$O(n^{1/2})$. They point out however that if each vertex to
which the baton passes is allowed to query not only its
immediate network neighbors but also its second neighbors,
then the performance improves markedly to
$O(n^{2(1-2/\alpha)})$.
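
To compare the three scaling forms quoted above, it may help to tabulate the exponent of n in each case. The helper below is our own illustration: the name `search_exponents` is hypothetical, and the middle exponent, 1 - 1/alpha, is obtained by repeating the argument n divided by maximum degree with the maximum degree taken as n^(1/alpha), which reproduces the quoted endpoints 1/2 and 2/3.

```python
def search_exponents(alpha):
    """Exponents of n in the search time for a power-law exponent alpha.

    Returns a tuple (a, b, c) such that the number of steps scales as
      n**a with maximum degree n**(1/(alpha-1)),
      n**b with maximum degree n**(1/alpha),
      n**c with maximum degree n**(1/alpha) and second-neighbor queries.
    """
    if not 2 < alpha < 3:
        raise ValueError("the scaling forms are quoted for 2 < alpha < 3")
    a = (alpha - 2) / (alpha - 1)
    b = 1 - 1 / alpha
    c = 2 * (1 - 2 / alpha)
    return a, b, c


a, b, c = search_exponents(2.5)
```

For alpha = 2.5 the three exponents are approximately 0.33, 0.6, and 0.4 respectively.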

The algorithm of Adamic et al. has been tested
numerically on graphs with the structure of the
configuration model (Sec. IV.B.1) and the Barabási-Albert
preferential attachment model [5, 232] (Sec. VII.B), and
shows behavior in reasonable agreement with the expected
scaling forms.

The reader might be forgiven for feeling that these al- 
gorithms are cheating a little, since the running time of 
the algorithm is measured by the number of hands the 
baton passes through. If one measures it in terms of the 
number of queries that must be responded to by network 
vertices, then the algorithm is still O(n), just as the sim- 
ple breadth-first search is. Adamic et al. suggest that 
each vertex therefore keep a local directory or index of 
the information (such as data files) stored at neighboring 
vertices, so that queries concerning those vertices can be 
resolved locally. For distributed databases and file shar- 
ing networks, where bandwidth, in terms of communi- 
cation overhead between vertices, is the costly resource, 
this strategy really does improve scaling with network 
size, reducing overhead per query to O(logn) in the best 
case. 



3. Network navigation 

The work of Adamic et al. discussed in the preceding
section considers how one can design a network search
algorithm to exploit statistical features of network
structure to improve performance. A complementary
question has been considered by Kleinberg [238, 239]:
Can one design network structures to make a particular
search algorithm perform well? Kleinberg's work is
motivated by the observation, discussed in Sec. III.H,
that people are able to navigate social networks
efficiently with only local information about network
structure. Furthermore, this ability does not appear to
depend on any particularly sophisticated behavior on the
part of the people. When performing the letter-passing
task of Milgram [283, 393], for instance, in which
participants are asked to communicate a letter or message
to a designated target person by passing it through their
acquaintance network (Sec. II.A), the search for the target
is performed, roughly speaking, using a simple "greedy 
algorithm." That is, at each step along the way the letter 
is passed to the person that the current holder believes 
to be closest to the target. (This in fact is precisely how 
participants were instructed to act in Milgram's experi- 
ments.) The fact that the letter often reaches the target 
in only a short time then indicates that the network it- 
self must have some special properties, since the search 
algorithm clearly doesn't. 

Kleinberg suggested a simple model that illustrates 
this behavior. His model is a variant of the small-world
model of Watts and Strogatz (Sec. VI) in
which shortcuts are added between pairs of sites on a 
regular lattice (a square lattice in Kleinberg's studies). 
Rather than adding these shortcuts uniformly at random 
as Watts and Strogatz proposed, Kleinberg adds them 
in a biased fashion, with shortcuts more likely to fall be- 
tween lattice sites that are close together in the Euclidean 
space defined by the lattice. The probability of a
shortcut falling between two sites goes as $r^{-\alpha}$,
where $r$ is the distance between the sites and $\alpha$ is
a constant. Kleinberg proves a lower bound on the mean time
$t$ (i.e., number of steps) taken by the greedy algorithm to
find a randomly chosen target on such a network. His bound
is $t \ge c n^\beta$, where $c$ is independent of $n$ and

\[
\beta = \begin{cases}
  (2-\alpha)/3 & \text{for } 0 \le \alpha < 2, \\
  (\alpha-2)/(\alpha-1) & \text{for } \alpha > 2.
\end{cases}
\]

Thus the best performance of the algorithm is when $\alpha$
is close to 2, and precisely at $\alpha = 2$ the greedy
algorithm should be capable of finding the target in
O(log n) steps. Kleinberg also gave computer simulation
results confirming this result. More generally, for
networks built on an underlying lattice in $d$ dimensions,
the optimal performance of the greedy algorithm occurs at
$\alpha = d$ [238, 239]. (See also Ref. [193] for some
rigorous results on the performance of greedy algorithms on
Watts-Strogatz type networks.)
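
A small simulation of greedy navigation on a lattice with distance-biased shortcuts might look as follows. It is a sketch under stated assumptions, not Kleinberg's construction in detail: we give every site exactly one outgoing shortcut, measure r as the Manhattan lattice distance, use a lattice without periodic boundaries, and the names `build_shortcuts`, `greedy_route`, and `bound_exponent` are our own.

```python
import random


def lattice_distance(u, v):
    """Manhattan distance between two lattice sites."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def build_shortcuts(L, alpha, rng):
    """Give each site of an L x L lattice one shortcut, chosen with
    probability proportional to r**(-alpha)."""
    sites = [(i, j) for i in range(L) for j in range(L)]
    shortcuts = {}
    for u in sites:
        others = [v for v in sites if v != u]
        weights = [lattice_distance(u, v) ** (-alpha) for v in others]
        shortcuts[u] = rng.choices(others, weights=weights, k=1)[0]
    return shortcuts


def greedy_route(L, shortcuts, source, target):
    """Count the steps of the greedy algorithm: always move to the
    neighbor (lattice or shortcut) that is closest to the target."""
    steps, current = 0, source
    while current != target:
        i, j = current
        neighbors = [(i + di, j + dj)
                     for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if 0 <= i + di < L and 0 <= j + dj < L]
        neighbors.append(shortcuts[current])
        current = min(neighbors, key=lambda v: lattice_distance(v, target))
        steps += 1
    return steps


def bound_exponent(alpha):
    """Exponent beta in the lower bound t >= c * n**beta quoted above."""
    return (2 - alpha) / 3 if alpha < 2 else (alpha - 2) / (alpha - 1)


rng = random.Random(0)
shortcuts = build_shortcuts(8, 2.0, rng)
steps = greedy_route(8, shortcuts, (0, 0), (7, 7))
```

Since some lattice neighbor always lies one step closer to the target, the greedy walk cannot take longer than the lattice distance; the shortcuts can only reduce that number.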
[Figure: tree diagram with the label "groups of individuals".]

FIG. 15 The hierarchical "social distance" tree proposed by
Watts et al. [415] and by Kleinberg [240]. Individuals are
grouped together by occupation, location, interest, etc., and
then those groups are grouped together into bigger groups
and so forth. The social distance between two individuals
is measured by how far one must go up the tree to find the
lowest "common ancestor" of the pair.



Kleinberg's work shows that many networks do not al- 
low fast search using a simple algorithm such as a greedy 
algorithm, but that it is possible to design networks that 
do allow such fast search. The particular model he stud- 
ies however is quite specialized, and certainly not a good 
representation of the real social networks that inspired his 
investigations. An alternative model that shows similar 
behavior to Kleinberg's, but which may shed more light 
on the true structure of social networks, has been
proposed by Watts et al. [415] and independently by
Kleinberg [240]. The "index" experiments of Killworth and
Bernard [50, 230] indicate that people in fact navigate
social networks by looking for common features between
their acquaintances and the target, such as geographical
location or occupation. This suggests a model in which
individuals are grouped (at least in the participants'
minds) into categories according, for instance, to their
jobs. These categories may then themselves be grouped into
supercategories, and so forth, creating a tree-like
hierarchy of organization that defines a "social distance"
between any two people: the social distance between two
individuals is measured by the height of the lowest level
in the tree at which the two are connected (see Fig. 15).

The tree however is not the network, it is merely a 
mental construct that affects the way the network grows. 
It is assumed that the probability of there being an edge
between two vertices is greater the shorter the social
distance between those vertices, and both Watts et al. [415]
and Kleinberg [240] assumed that this probability falls off
exponentially with social distance. The greedy algorithm
for communicating a message to a target person then
specifies that the message should at each step be passed
to that network neighbor of the current holder who has
the shortest social distance to the target. Watts et al.
showed by computer simulation that such an algorithm
performs well over a broad range of parameters of the
model, and Kleinberg showed that for appropriate
parameter choices the search can be completed in time
again O(log n).
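
The hierarchical model can be illustrated with a short sketch. The details here are assumptions of ours and not those of Watts et al. or Kleinberg: individuals are numbered consecutively, the lowest-level groups have `group_size` members, the tree has a fixed branching ratio, distance zero is assigned to members of the same lowest-level group, edge probabilities are proportional to exp(-a x) for social distance x, and the greedy search gives up after `max_steps` steps since nothing in this toy version prevents it from cycling.

```python
import math
import random


def social_distance(u, v, group_size=2, branching=2):
    """Height of the lowest common ancestor of the groups of u and v."""
    gu, gv = u // group_size, v // group_size
    height = 0
    while gu != gv:
        gu //= branching
        gv //= branching
        height += 1
    return height


def build_network(n, degree, a, rng):
    """Each vertex picks `degree` partners with probability proportional
    to exp(-a * social distance); edges are undirected."""
    adj = {u: set() for u in range(n)}
    for u in range(n):
        others = [v for v in range(n) if v != u]
        weights = [math.exp(-a * social_distance(u, v)) for v in others]
        for v in rng.choices(others, weights=weights, k=degree):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def greedy_search(adj, source, target, max_steps=100):
    """Pass the message to the neighbor socially closest to the target.

    Returns the number of steps, or None if the target is not reached."""
    current, steps = source, 0
    while current != target and steps < max_steps:
        current = min(adj[current], key=lambda v: social_distance(v, target))
        steps += 1
    return steps if current == target else None


# Hypothetical hand-built chain through successively closer groups.
chain = {0: {1, 4}, 1: {0}, 4: {0, 6}, 6: {4, 7}, 7: {6}}
chain_steps = greedy_search(chain, 0, 7)
network = build_network(16, 2, 1.0, random.Random(1))
```

In the hand-built example the message moves from vertex 0 to 4, then 6, then 7, each holder choosing the acquaintance whose group lies closest to the target in the tree.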

While this model is primarily a model of search on social
networks (or possibly the Web [240]), Watts et al.
also suggested that it could be used as a model for de- 
signed networks. If one could arrange for items in a dis- 
tributed database to be grouped hierarchically according 
to some identifiable characteristics, then a greedy algo- 
rithm that is aware of those characteristics should be 
able to find a desired element in the database quickly, 
possibly in time only logarithmic in the size of the 
database. This idea has been studied in more detail by 
Iamnitchi et al. [205] and Arenas et al.

One disadvantage of the hierarchical organizational 
model is that in reality the categories into which network 
vertices fall almost certainly overlap, whereas in the hier- 
archical model they are disjoint. Kleinberg has proposed 
a generalization of the model that allows for overlapping 
categories and shows search behavior qualitatively
similar to the hierarchical model [240].



D. Phase transitions on networks 

Another group of papers has dealt with the behavior 
on networks of traditional statistical mechanical models 
that show phase transitions. For example, several au- 
thors have studied spin models such as the Ising model 
on networks of various kinds. Barrat and Weigt [40]
studied the Ising model on networks with the topology of
the small-world model [416] (see Sec. VI) using replica
methods. They found, unsurprisingly, that in the limit
$n \to \infty$ the model has a finite-temperature transition
for all values of the shortcut density $p > 0$. Further
results for Ising models on small-world networks can be
found in
Refs. EH El HE! Eal EH and the model has also 
been studied on random graphs [112, 264] and on networks
with the topology of the Barabási-Albert growing network
model (Sec. VII.B).

The motivation behind studies of spin models on 
networks is usually either that they can be regarded 
as simple models of opinion formation in social
networks [426] or that they provide general insight into
the effects of network topology on phase transition
processes. There are however other more direct approaches
to both of these issues. Opinion formation can be studied
more directly using actual opinion formation models. And
Goltsev et al.
have examined phase transition behavior on networks 
using the general framework known as Landau theory. 
They find that the critical behavior of models on a net- 
work depends in general on the degree distribution, and 
is in particular strongly affected by power-law degree dis- 
tributions. 

One class of networked systems showing a phase tran- 
sition that is of real interest is the class of NP-hard com- 
putational problems such as satisfiability and colorability 
that show solvability transitions. The simplest example 
of such a system is the colorability problem, which is
related to problems in operations research such as
scheduling problems and also to the Potts model of statistical
mechanics. In this problem a number of items (vertices) 
are divided into a number of groups (colors). Some pairs 
of vertices cannot be in the same group. Such a con- 
straint is represented by placing an edge between those 
vertices, so that the set of all constraints forms a graph. 
A solution to the problem of satisfying all constraints si- 
multaneously (if a solution exists) is then equivalent to 
finding a coloring of the graph such that no two adjacent
vertices have the same color. Problems of this type are
found to show a phase transition between a region of low
graph density (low ratio of edges to vertices), in which
most graphs are colorable, and one of high density, in
which most are not. A considerable amount of work has been
carried out on this and similar problems in the computer
science community [131]. However, this work has primarily
been restricted to Poisson random graphs; it is largely an
open question how the results will change when we look at
more realistic network topologies. Walsh [406] has looked
at colorability in the Watts-Strogatz small-world model
(Sec. VI), and found that these networks are easily
colorable for both small and large values of the shortcut
density parameter $p$, but harder to color in intermediate
regimes. Vázquez and Weigt [402] examined the related
problem of vertex covers and found that on generalized
random graphs solutions are harder to find for networks
with strong degree correlations of the type discussed in
Sec. III.F.
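
For small graphs the colorability question can be explored directly. The following is a minimal sketch under our own assumptions: `is_colorable` is a plain back-tracking search (exponential in the worst case, so suitable only for small n), and `random_graph` draws a graph with a fixed number of distinct edges chosen uniformly at random as a stand-in for a Poisson random graph. Neither function is taken from the works cited.

```python
import random


def is_colorable(adj, q):
    """Back-tracking test of whether a graph can be colored with q colors
    so that no two adjacent vertices share a color."""
    order = sorted(adj, key=lambda v: -len(adj[v]))  # high degree first
    color = {}

    def place(k):
        if k == len(order):
            return True
        v = order[k]
        for c in range(q):
            if all(color.get(u) != c for u in adj[v]):
                color[v] = c
                if place(k + 1):
                    return True
                del color[v]
        return False

    return place(0)


def random_graph(n, m, rng):
    """Graph on n vertices with m distinct edges chosen at random.
    Assumes m does not exceed n * (n - 1) / 2."""
    adj = {v: set() for v in range(n)}
    edges = 0
    while edges < m:
        u, v = rng.sample(range(n), 2)
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            edges += 1
    return adj


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
sample = random_graph(12, 15, random.Random(0))
```

Repeating `is_colorable(random_graph(n, m, rng), q)` for increasing values of m at fixed n and recording the fraction of colorable graphs gives a simple way to look for the transition described in the text.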



E. Other processes on networks 

Preliminary investigations, primarily numerical in na- 
ture, have been carried out of the behavior of various 
other processes on networks. A number of authors have 
looked at diffusion processes. Random walks, for example,
have been examined by Jespersen et al. [210], Pandit and
Amritkar [329], and Lahtinen et al. [258, 259]. Solutions
of the diffusion equation can be expressed as linear
combinations of eigenvectors of the graph Laplacian, which
has led a number of authors to investigate the Laplacian
and its eigenvalue spectrum [150, 173, 289]. Discrete
dynamical processes have also attracted some attention.
One of the earliest examples of a statistical model of a
networked system falls in this category, the random
Boolean net of Kauffman, which is a model of a genetic
regulatory network (see Sec. II.D). Cellular automata on
networks have been investigated by Watts and Strogatz
[412, 416], and voter models and models of opinion
formation can also be regarded as cellular automata
[84, 256, 403]. Iterated games on networks have been
investigated by several authors [2, 135, 231, 410], and some
interesting differences are seen between behavior on
networks and on regular lattices. Other topics of
investigation have included weakly coupled oscillators,
neural networks [257, 382], and self-organized critical
models [106, 252, 300]. A useful discussion of the behavior
of dynamical systems on networks has been given by
Strogatz [387].



IX. SUMMARY AND DIRECTIONS FOR FUTURE 
RESEARCH 

In this article we have reviewed some recent work on
the structure and function of networked systems. Work 
in this area has been motivated to a high degree by em- 
pirical studies of real-world networks such as the Inter- 
net, the World Wide Web, social networks, collaboration 
networks, citation networks, and a variety of biological 
networks. We have reviewed these empirical studies in 
Secs. II and III, focusing on a number of statistical
properties of networks that have received particular
attention, including path lengths, degree distributions,
clustering, and resilience. Quantitative measurements for a
variety of networks are summarized in Table II. The most
important observation to come out of studies such as 
these is that networks are generally very far from ran- 
dom. They have highly distinctive statistical signatures, 
some of which, such as high clustering coefficients and 
highly skewed degree distributions, are common to net- 
works of a wide variety of types. 

Inspired by these observations many researchers have 
proposed models of networks that typically seek to ex- 
plain either how networks come to have the observed 
structure, or what the expected effects of that struc- 
ture will be. The largest portion of this review has been 
taken up with discussion of these models, covering
random graph models and their generalizations (Sec. IV),
Markov graphs (Sec. V), the small-world model (Sec. VI),
and models of network growth, particularly the
preferential attachment models (Sec. VII).

In the last part of this review (Sec. VIII) we have
discussed work on the behavior of processes that take place
on networks. The notable successes in this area so far 
have been studies of the spread of infection over networks 
such as social networks or computer networks, and stud- 
ies of the effect of the failure of network nodes on per- 
formance of communications networks. Some progress 
has also been made on phase transitions on networks and
on dynamical systems on networks, particularly discrete 
dynamical systems. 

In looking forward to future developments in this area 
it is clear that there is much to be done. The study of 
complex networks is still in its infancy. Several general 
areas stand out as promising for future research. First, 
while we are beginning to understand some of the pat- 
terns and statistical regularities in the structure of real- 
world networks, our techniques for analyzing networks 
are at present no more than a grab-bag of miscellaneous 
and largely unrelated tools. We do not yet, as we do in 
some other fields, have a systematic program for
characterizing network structure. We count triangles on
networks or measure degree sequences, but we have no idea
if these are the only important quantities to measure
(almost certainly they are not) or even if they are the most
important. We have as yet no theoretical framework to 
tell us if we are even looking in the right place. Per- 
haps there are other measures, so far un-thought-of, that 
are more important than those we have at present. A 
true understanding of which properties of networks are 
the important ones to focus on will almost certainly re- 
quire us to state first what questions we are interested 
in answering about a particular network. And knowing 
how to tie the answers to these questions to structural 
properties of the network is therefore also an important 
goal. 

Second, there is much to be done in developing more 
sophisticated models of networks, both to help us un- 
derstand network topology and to act as a substrate for 
the study of processes taking place on networks. While 
some network properties, such as degree distributions,
have been thoroughly modeled and their causes and effects
well understood, others such as correlations,
transitivity, and community structure have not. It seems
certain that these properties will affect the behavior of 
networked systems substantially, so our current lack of 
suitable techniques to handle them leaves a large gap in 
our understanding. 

Which leads us to our third and perhaps most im- 
portant direction for future study, the behavior of pro- 
cesses taking place on networks. The work described in 
Sec. VIII represents only a few first attempts at
answering questions about such processes, and yet this, in a
sense, is our ultimate goal in this field: to understand 
the behavior and function of the networked systems we 
see around us. If we can gain such understanding, it 
will give us new insight into a vast array of complex and 
previously poorly understood phenomena.